PREFACE


The Science of Statistics occupies an important place in
modern knowledge. It has a special significance in underdeveloped
economies wedded to planned economic development. For this
reason Statistics has become a large subject, and its study
is included in the courses of Arts, Science, Commerce, Agriculture,
Medicine etc. in almost all the universities. There is an
ever-increasing number of students offering this subject. The
present book is intended to meet the requirements of students
preparing for M.A. (Eco.), M. Com., B.A. (Econ. Hons), B. Com.,
B.Sc. and professional examinations conducted by the Universities
and other Examining bodies of the country.


The book covers the theoretical, practical and applied aspects
of Statistics. Within a reasonable compass it contains almost
every important matter that a student of so wide a subject
would require. I have attempted to give as many illustrations
as possible in order to help students understand various
typical problems. I hope that the book will prove useful to the
student community. I claim no originality for this book, but the
method of presentation and the arrangement of the subject matter
are my own, developed while teaching the subject to
M.Com. and B.Com. classes when I was posted at Maharani
Laxmibai College, Gwalior (formerly known as Victoria College,
Gwalior).

I hope that the book will be liked by my teacher friends, from
whom I also invite suggestions for its improvement
in future; these will be accepted with gratitude. Finally,
I would like to thank the publishers for undertaking the task
of publishing the book.


June, 1963 B. N. Gupta 
CHAPTER 1


INTRODUCTION 


"Statistical thinking will one day be as necessary for efficient 
citizenship as the ability to read and write." 


H. G. Wells


Nature of the Subject. Statistics is not a body of
substantive knowledge, but a body of methods for obtaining
knowledge. Knowledge obtained with the help of statistical
methods and supported by numerical facts is accurate and
precise. Lord Kelvin remarked that, “when you can measure
what you are speaking about and express it in numbers you
know something about it, but when you cannot measure it, when
you cannot express it in numbers, your knowledge is of a meagre
and unsatisfactory kind.” Such is the important place occupied
by Statistics. Quantitative data substantiate knowledge.
The methods which are used in acquiring knowledge with the
help of numerical facts are called ‘Statistical Methods’.
Statistical methods are used in the collection, analysis and
interpretation of quantitative data.

Origin and Development of Statistics. The term ‘Statistics’
has been derived from the Latin word ‘Status’, the Italian word
‘Statista’ or the German word ‘Statistik’. These words mean
‘political state’ or the statesman’s art. In this sense the word
is found in the seventeenth century in the writings of
Shakespeare and Milton. Shakespeare used it in his
famous drama ‘Hamlet’ and Milton used it in his
famous epic ‘Paradise Regained’. W. Hooper in 1770 used this
word in his translation of Von Bielfeld’s ‘Elements of Universal
Erudition’ and he defined statistics in that book as, “The science
that teaches us what is the Political arrangement of all the
modern states of the known world.”

There were two main reasons for
the growth and development of statistics:
(a) Governments of different states used to collect statistics
for evaluating their military strength and the taxable capacity
of their subjects. Such statistics were collected in Rome,
England, Germany, India, Egypt and in all other older countries.
This is the reason for describing Statistics as the ‘Science of
kings’ or ‘Science of Statecraft’. William Petty regarded it
as ‘Political arithmetic’.

(b) The gamblers of the seventeenth century were also
responsible for the early development of Statistics. Their
problems drew the attention of such men as Galileo, Pascal,
de Méré, Fermat and Cardano. These early beginnings resulted
in a theory of probability that is still being expanded and
refined by mathematicians and forms the basis of modern
statistics. Brilliant mathematicians like James Bernoulli and his
nephew Daniel Bernoulli, Gauss, Laplace etc., while calculating
the chances of winning in gambling, gradually discovered and
developed the theory of probability. Similarly another
mathematician of England, De Moivre, discovered the theory of the
normal distribution on which a large part of modern statistical
theory is based. The great Belgian mathematician Quetelet
grasped the significance of one of the fundamental principles,
called ‘the constancy of great numbers’. Sir Francis Galton, a
cousin of Charles Darwin, Karl Pearson, Knapp, Lexis,
Edgeworth, A. L. Bowley, Fisher and several other noted
statisticians have contributed a lot towards the development of
this science, and it is due to their genius that statistics has
reached its present stage. Karl Pearson and Fisher have done a
lot of research work, and the development of many statistical
techniques is credited to them. Alexander McFarlane Mood,
in his book “An Introduction to the Theory of Statistics”,
appreciating the contribution of Fisher to the Science of Statistics,
observes, “Fisher is the real giant in the development of the
theory of Statistics. His first paper was published in 1912 and
his work continues unabated today. Although hundreds of
scholars have contributed to the science of statistics, this one
man must be credited with at least half of the essential and
important developments as the theory now stands.” Fisher is
regarded as the greatest figure in the history of statistics. In
India Prof. Mahalanobis has contributed a lot in the theoretical
and applied fields of statistics. In the applied field the names of
Dr. V. K. R. V. Rao, R. C. Desai etc. are also noteworthy.
For the past two decades there has been a remarkable and
sustained growth in the theoretical and applied fields of statistics.
Statistics has now become a universally applicable science. The
fields in which statistics can be applied are
numerous and diverse. That is why Dr. A. L. Bowley remarked
that, “A knowledge of statistics is like a knowledge of foreign
language or algebra, it may prove of use at any time under any
circumstances.”

Meaning of Statistics. The word ‘STATISTICS’ is used
in two senses. Commonly statistics is used to refer to the data
themselves, as in statistics of national income, or of accidents, or
the figure measurements of one’s favourite movie actress. To most
people statistics mean numerical data, that is, figures. This is the
use of the word statistics in the plural. In the Science of statistics
it is better to call such figures data.

But there is also a field of knowledge called statistics. Here
the word “Statistics” is used in the singular. In this sense the
term ‘Statistics’ refers to the whole field of study of which
statistics in the plural sense are the subject matter. The subject
of statistics is concerned with the collection, presentation, description
and analysis of data which are measurable in numerical terms.
In other words ‘statistics’ refers to the statistical principles and
methods which have been developed for handling numerical data.

Statistics as a method of research. Statistics is not a
science; it is a scientific method. Two methods
are employed in conducting research, viz. the experimental method
and the statistical method. The experimental method is adopted in
physical or exact sciences like Physics, Chemistry etc. Under
this method the factors under investigation are isolated and
varied according to a predetermined plan, while other factors
having a bearing on the observed phenomena are kept constant.
This is what is done in all experiments carried out in the
laboratory. However, in social sciences like Economics,
Sociology, Politics etc., which are less exact sciences,
experimentation is seldom possible, because it is not possible to isolate
the factors under investigation. For example, if it is sought to
study the effect of a change of income on the health of the
community, it is not possible to keep constant all other factors
affecting health and vary income according to a predetermined
plan. Thus in such cases the experimental method cannot be applied.
Statistical methods and procedures constitute a useful and
often indispensable tool for research workers in the social sciences.
Without an adequate understanding of statistics, the investigator
in the social sciences may be like a blind man groping in a dark
room for a black cat which is not there. Statistical methods
are useful in an ever-widening range of human activities, in any
field of thought in which numerical data may be had.

But it must not be assumed that the statistical method is the
only method to use in research, nor should this method be
considered the best line of attack for every problem. Just as a
carpenter has a number of tools, each appropriate for a different
job, so the researcher can use various techniques which are
the tools of his trade and each of which is appropriate to a
specific type of situation. If a new carpenter uses a screwdriver
in lieu of a chisel, the results are not likely to be either
workmanlike or satisfactory. Similarly it is important that
the investigator should examine the problem under investigation
very carefully and make use of the technique appropriate to it.
Just as a carpenter needs to use more than one tool in
completing a piece of work, so a research worker must make
use of not one but several methods. However, it should be
kept in mind that the classification of scientific methods into
those which are experimental and those which are statistical,
like most classifications, is formal and arbitrary and not entirely
realistic. Even in exact sciences, where experimentation is
possible, statistical methods also help in getting correct results.
When a scientist comes to work on a problem in practice, he
usually combines elements of both the statistical and the
experimental approach. Many of the most important
statistical methods originated in the fields of physics and
astronomy, fields that we usually think of as ‘exact’ sciences.
Even in these fields the scientist has to contend with errors of
observation, and in addition he usually finds it impossible to
record the values of all the variables which are involved.
Under such circumstances the exact scientist is forced to
combine statistical methods with his experimental procedures.
On the other hand even the social scientist can and does use a
certain amount of control in his investigations. Thus the
experimental method and the statistical method are both necessary for
research, just as the right and the left feet are needed for
walking.

Statistical methods are those methods by which statistical
data are analysed. According to a memorandum prepared by
a Committee of the Royal Statistical Society, London, “The
methods of statistics in the modern sense range from the mere
recording and tabulation of numerical data to subtle processes
of inductive reasoning based on the mathematical theory of
probability.” Statistical methods are thus tools for handling
data, covering every stage from the collection of data to their final
interpretation.

Definition of Statistics. Statistics has been defined by
different statisticians to cover two separate concepts: (1)
descriptive statistics or data and (2) statistical methods.
According to the first concept, statistics is expressed as
pertaining to numerical data. This concept takes statistics in
the plural sense. According to the second concept, statistics is
expressed as a science. It takes “Statistics” in the singular
sense.

(A) Definitions According to the First Concept. (i)
According to this concept, the most exhaustive definition
has been given by Prof. Horace Secrist. He defines statistics thus:

“By statistics we mean aggregates of facts affected to a
marked extent by multiplicity of causes, numerically expressed,
enumerated or estimated according to reasonable standards of
accuracy, collected in a systematic manner for a pre-determined
purpose and placed in relation to each other.”

This definition mentions the characteristics which data,
the subject matter of statistics, should possess. According to
this definition statistics in the plural sense should have the following
features:

(1) Statistics are aggregates of facts. Single or unrelated
figures are not statistics, because they do not throw light on
any problem. Figures like 20, 25, 18, 23 etc. cannot be
called statistics, but if they are placed in a series indicating
that for husbands aged 20 and 25 years the respective
ages of the wives are 18 and 23 years, then these figures become
statistics. Studies of an individual item like a single death, birth,
sale, purchase etc. are not important from the point of view of
statistics, as comparison is not possible.

(2) They are affected to a marked extent by multiplicity
of causes. Statistical data are subject to the influence of a large
number of factors and it is not possible to single out the effects
of one individual factor. For example, if data regarding
agricultural production are collected, the production is
affected by a number of factors like rains, quality of soil and
seed, method of cultivation, manuring etc., and no individual
factor can be studied separately.

(3) They must be numerically expressed. Qualitative
expressions like good, poor, young, old etc. do not form part of
statistical studies. Statistical studies are possible only of those
features which can be expressed in quantitative terms. If the
economic conditions of the inhabitants of a country are to be
studied, divisions like rich, middle-class and poor will not
be of much help unless these divisions are made on some
quantitative ground, such as persons having income above 1000,
persons having income between 250 and 1000, and persons having
income below 250.
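
The income grouping just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `income_class`, the sample incomes, and the decision to count the boundary values 250 and 1000 as middle-class are all hypothetical and do not come from the text.

```python
from collections import Counter


def income_class(income, lower=250, upper=1000):
    """Turn a quantitative income figure into a class label.

    The limits 250 and 1000 are those quoted in the text; treating
    both limits as belonging to the middle class is an assumption
    made for this sketch.
    """
    if income > upper:
        return "rich"
    if income >= lower:
        return "middle-class"
    return "poor"


# Hypothetical incomes, invented purely for illustration.
incomes = [120, 250, 400, 999, 1000, 1500, 80, 2300]

# Count how many persons fall into each class.
counts = Counter(income_class(x) for x in incomes)
```

Once the classes rest on numerical limits, the number of persons in each class can be counted and compared, which the bare labels rich, middle-class and poor do not allow.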

(4) They should be enumerated or estimated according
to a reasonable standard of accuracy. It is natural that
estimated figures cannot be absolutely accurate and precise.
For obtaining correct results, chances of error should be
minimised by maintaining a reasonable standard of accuracy. The
standard of accuracy will depend upon the purpose of the enquiry.
It cannot be uniform for all types of
enquiries. For example, if the heights of a group of students are
being measured, the measurements may be taken as right if
they are correct to a fourth of an inch, but while measuring the
distance from Delhi to Gwalior such accuracy is not wanted.
In that case a few furlongs may be ignored.

(5) They should be collected in a systematic manner for
a pre-determined purpose. The collection of facts should be
systematic; figures collected haphazardly are not desirable
because they may lead to wrong conclusions. The object of
collecting facts must be clear and definite and must be
determined beforehand, because this will guide the collection
of data.

(6) They should be placed in relation to each other.
Statistics are collected mostly for the purpose of comparison
between two or more phenomena. If collected data are not fit
for comparison, much of their importance is gone. To make data
fit for comparison, they should be homogeneous and uniform.
Heterogeneous data are not comparable.

(ii) Webster defines statistics as, “Statistics are classified
facts respecting the condition of the people in a state....
especially those facts which can be stated in numbers or in
tables of numbers or in any tabular or classified arrangement.”
This definition limits the scope of statistics. According to this
definition, only those facts are called statistics which are related
to the condition of the people in a state. Facts regarding
Physics, Chemistry, Sociology and Psychology will not be called
statistics. In the present age statistics are collected in respect
of all aspects of human activity. Hence this definition is
inadequate and not comprehensive.

(iii) According to Dr. A. L. Bowley, “Statistics are numerical
statements of facts in any department of enquiry, placed in
relation to each other.” According to this definition statistics
have three characteristics, viz. (A) they are numerical
statements of facts, (B) they are concerned with an enquiry and (C)
they are placed in relation to each other, for comparison. Other
characteristics are not included in this definition.

(iv) Yule and Kendall state that, “By Statistics we mean
quantitative data affected to a marked extent by a multiplicity of
causes.” This definition is also incomplete as it mentions only
two characteristics of statistics, viz. (A) they are quantitative
and (B) they are affected by a multiplicity of causes.

(v) According to Connor, “Statistics are measurements,
enumerations or estimates of natural or social phenomena,
systematically arranged so as to exhibit their inter-relations.”
This definition states that (A) statistics are measurements,
estimates or enumerations, (B) they relate to some natural or
social phenomena, and (C) they are systematically arranged to
exhibit their inter-relations.

(B) Definitions According to the Second Concept. These
definitions are based on the conception that “Statistics is what
Statistics does.” These definitions define statistics as statistical
methods. Here the word statistics is treated as
singular.

(i) Dr. A. L. Bowley has given a number of incomplete
definitions which touch only some of the aspects of statistics.

(a) At one place Bowley says that, “Statistics may be called
the science of counting.” No doubt it is true that counting is
a process in statistical methods, but counting alone is not
statistics. Besides counting, estimates and probabilities are
equally important processes. This definition applies
only to the collection of
data and not to their analysis and interpretation, which are equally
important.

(b) At another place Dr. Bowley remarks that, “statistics
may rightly be called the science of averages.” Averages are
no doubt very important in statistics, because they refer to the
central tendency of the data, but besides averages, measures of
dispersion, skewness, correlation etc. and the presentation of data in
the form of diagrams and graphs are no less important. This
definition too is incomplete and includes only one aspect of the
science. It fails to draw attention to the fundamental nature of
statistics.
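
Why averages alone do not suffice can be seen from a small numerical sketch. The two groups of marks below are hypothetical figures invented for illustration, and Python's standard `statistics` module is used only as a convenient calculator; nothing here comes from the text itself.

```python
import statistics

# Hypothetical marks of two groups of five students each,
# invented purely for illustration.
group_a = [48, 49, 50, 51, 52]
group_b = [10, 30, 50, 70, 90]

# The average (arithmetic mean) is the same for both groups.
mean_a = statistics.mean(group_a)
mean_b = statistics.mean(group_b)

# A measure of dispersion (here the standard deviation of the
# whole group) shows how differently the marks are scattered.
sd_a = statistics.pstdev(group_a)
sd_b = statistics.pstdev(group_b)
```

Both groups have the same mean, yet the marks of the second group are scattered far more widely than those of the first, so a description by average alone would treat two very different groups as identical.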

(c) Another definition given by Dr. Bowley is that statistics
is “the science of measurement of social organism regarded
as a whole in all its manifestations.” This definition limits the
application of statistical methods to the social sciences only. It
makes statistics a branch of Sociology. He himself at
another place expresses the contrary view that, “Statistics is not
merely a branch of political economy nor is it confined to any
one science. Statistics can be applied in varied fields.” Hence
this definition too is unsatisfactory.

(ii) Like Dr. A. L. Bowley, Prof. Boddington defines
Statistics thus: “Statistics is the science of estimates and
probabilities.” This definition is not complete as it throws light on
only one aspect of the science. The use of ‘estimates’ and
‘probabilities’ is becoming popular in statistics, but other techniques
are also employed.


He lays more emphasis on the interpretation aspect of statistics
rather than the collection and analysis aspects.

(iv) Other definitions given by noted statisticians are:

(a) “Statistics is the Science which deals with the methods
of collecting, classifying, presenting, comparing and
interpreting numerical data collected to throw some light on any sphere
of enquiry.” SELIGMAN

This definition is short and simple, yet comprehensive.

(b) "Statistics is the Science and method of analysing 
groups of related numbers in order to discover their relation- 
ship and meanings." BLAIR 

(c) “Statistics deals with the collection, classification and
tabulation of numerical facts as the basis for explanation,
description and comparison of phenomena.” LOVITT

(d) “Statistics is the science and the art of handling
aggregate of facts: observing, enumerating, recording,
classifying and otherwise systematically treating them.” HARLOW

(e) “The subject ‘Statistics’ is concerned with the collection, 
presentation, description and analysis of data which are 
measurable in numerical terms." P. H. KARMEL 

(f) “The subject in this sense of statistics is a body of
methods of obtaining and analysing data in order to base
decisions on them.” W. A. WALLIS & H. V. ROBERTS

(g) “Statistics is a body of methods which are used when we 
wish to study masses of numerical data and to extract from 
them a few simple facts.” A. E. WAUGH 

Like economists, statisticians also have not agreed upon a
proper definition of statistics. It is rather difficult to confine the
scope of this very wide subject within the few words of a definition.
But by studying and analysing these definitions we can conclude
the following.

The term ‘Statistics’ (plural) is used to mean numerical
data capable of analysis and interpretation. Statistics (in the
singular sense) is an art and science of collection, presentation,
analysis and interpretation of numerical data.

Divisions of Statistics 

The science of statistics can be divided into the following
broad divisions:
(a) Theoretical Statistics
(b) Descriptive Statistics
(c) Inductive Statistics
(d) Applied Statistics.

(a) Theoretical Statistics. The mathematical theory which
is the basis of the science of statistics is called theoretical
statistics.

(b) Descriptive Statistics or Statistical Methods. Descriptive
statistics is concerned with collecting, tabulating,
analysing, interpreting and presenting numerical data.

(c) Inductive Statistics. Inductive statistics is a set of
intellectual tools based upon the mathematical theory of
probability, which enables us to use partial or limited numerical
information for producing generalisations, estimates, predictions
and decisions in such a way that the fallibility of the conclusions
can be assessed.


(d) Applied Statistics. Applied statistics consists of the
application of statistical methods and techniques to
problems and facts as they exist. Quality control, sample
surveys etc. are included in this division.

Nature of Statistics. Statistical methods are inductive
in their nature because generalisations result from the
observation of individuals. Generalisations made after a statistical
investigation are true only on an average. They show
only the typical behaviour of all the items and do not describe
the behaviour of individuals taken separately, because there
is greater stability in masses than in individuals. Thus statistical
generalisations, whatever their form may be, provide estimates
of the characteristic behaviour of populations and not of the
behaviour of individual members. These generalisations are of
great value and much of practical use has been built upon them. For
example, the business of insurance has developed only due to
certain statistical generalisations. In insurance “we do not
know who will die, but we know how many will die.”
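
The stability of masses as against individuals can be illustrated with a small simulation. This is a rough sketch under loud assumptions: the death rate of 1 per cent, the group size of 100,000 lives, the seed, and the function name `simulate_deaths` are all hypothetical, and real insurance work rests on observed mortality tables rather than on random numbers.

```python
import random


def simulate_deaths(n_lives, death_rate, seed):
    """Count deaths in a group where each life independently
    dies with probability `death_rate` during the year.

    Independence and a common death rate for every life are
    simplifying assumptions made for this sketch.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(1 for _ in range(n_lives) if rng.random() < death_rate)


n_lives = 100_000      # hypothetical number of insured lives
death_rate = 0.01      # assumed annual death rate (1 per cent)

deaths = simulate_deaths(n_lives, death_rate, seed=1963)
observed_rate = deaths / n_lives
```

Which particular lives end cannot be foretold from such a run, but for a large group the observed proportion of deaths stays close to the assumed rate, and it is on this regularity that a premium can be calculated.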

The Subject Matter of Statistics. The methods by which
statistical data are analysed are called statistical methods.
Statistical methods are specially appropriate for handling
data which are subject to variations that cannot be fully
controlled by experimental methods. ‘Statistical methods are
devices by the application of which quantitative data influenced
by multiple causation are collected and so scientifically analysed
and elucidated that they are brought within easy and clear
grasp.’ Thus they include all the processes by which data are
collected and scientifically analysed: the
collection of data, their classification, tabulation, comparison,
correlation and finally interpretation. These methods are
employed in both exact and inexact sciences. Statistical
methods are, therefore, the handmaid of both physical and social
sciences, but are of greater service to the latter.
According to H. Secrist,
“Statistical methods include all the devices of analysis and
synthesis by means of which statistics are scientifically collected
and used to explain or describe phenomena either in their
individual or related capacities.”

Whether Statistics is a Science or an Art? Now the question
arises whether Statistics is a science or an art or both. Science
is a body of systematised knowledge. It studies cause and
effect and tries to find out generalisations which are called the laws
of that science. It does not tell whether the result is good or
bad. It only describes what the facts are. It is like a
lighthouse that gives light to the ships to find out their own way
but does not indicate in which direction they should go.
According to Karl Pearson, that knowledge which (a) gives
mental education to the citizens, (b) throws light on important
social problems, (c) gives happiness in practical life and (d)
gives satisfaction to our artistic faculties may be called science.

Statistics is not a science like physics, chemistry, economics
and sociology. It may be called a science of scientific methods.
Statistics helps other sciences to derive their own laws.
Statistical knowledge is not for its own sake but for the sake of other
branches of knowledge. This is a difference between the science of
statistics and other sciences. But it is very useful in the advancement of
knowledge. It has been said that “Science
without statistics bears no fruit, statistics without science have
no root.” This statement indicates the relationship which
exists between statistics and other sciences.

Art is practical science. If science is knowledge then art
is action. By science we know a thing; by art we do that thing.
Statistics is regarded as an art of applying the science of scientific
methods. In the science of statistics we not only study different
methods of studying a problem, but we also study how these
methods should be applied in different situations. Thus
statistics may be regarded as the science of scientific methods and
the art of applying those methods.

Relation of Statistics with Mathematics. Statistics is a
branch of science which is based on mathematics. Anyone
who wishes to know the fundamentals of statistics must have
some knowledge of mathematics. According to Connor,
“Statistics is a branch of Applied Mathematics which specialises in
data.” It is on the theories of Mathematics that the entire
paraphernalia of statistics is raised. The basic theories of
statistics have been developed by mathematicians. Famous
statisticians like James Bernoulli, De Moivre, Laplace, Gauss,
Francis Galton, Karl Pearson, Fisher etc. were great
mathematicians.

Relation of Statistics with Economics. There is a close
relationship between Statistics and Economics. Statistics has
developed a separate branch of study called “Economic
Statistics”. Statistics has helped a lot in the development of
Economics. The close relationship of Economics, Mathematics
and Statistics has given birth to a new science called
‘Econometrics’. Statistical methods help in deriving economic
laws and also in testing the validity of existing economic laws.
Statistical methods make possible the development of the
empirical side of economics. Their use is necessary to give
real content to theoretical formulations. The inductive method
of economic studies is based upon statistical methods.

Factors related to the growth of Statistics. There has
been great and continuing growth in the use of statistics. The
demand for statistics has increased and so has the supply.
The fields in which statistics is applied most are business,
government and science. The extraordinary growth of all
three of these has led to the growth of the science of statistics.
The increase in the magnitude and functions of government
and the adoption of economic planning have created more demand
for statistics. The sciences, social and physical, have grown
tremendously and have become complex, thereby greatly
increasing the demand for statistics. There is great demand
for statistics in research.

The development of statistical theory has also had the 
effect of reducing the cost of compilation of statistical data 
especially by making it possible to base reliable conclusions on 
samples. But it is wrong to think that the current theory and 
methods of statistics are final. This is an ever-growing science. 
Statistical research is today more vigorous than ever before. 


Theoretical Questions 


1. “Statistical method is less precise than experimental
method but it can often be applied successfully when the latter
fails.”

Discuss the above statement and explain the importance of
Statistical methods in social science. (M.S.W. Lucknow)
2. Statistics is said to be both a science and an art. Why?
What relation, if any, has statistics with other Sciences?
(M.A. Agra)
3. Define ‘Statistics’ and point out the main difficulties that
a statistician has to face as compared with a physicist or a
chemist. (B. Com. Allahabad)
4. “When you can measure what you are speaking about
and express it in numbers, you know something about it; but
when you cannot measure it, when you cannot express it in
numbers, your knowledge is of a meagre and unsatisfactory kind.”
(Lord Kelvin)




Explain the above statement and show its importance in the 
theory of statistics. (M.A. Agra) 
5. "By statistics we mean quantitative data affected to a 
marked extent by a multiplicity of causes"—(Yule and Kendall) 


Explain. (M.A. Agra) 
6. “Statistics are not a mere mass of figures.” Elucidate.
(M.A. Punjab)


7. “Statistics are aggregate of facts affected to a marked
extent by multiplicity of causes, numerically expressed, enumerated
or estimated according to reasonable standard of accuracy,
collected in a systematic manner for a pre-determined purpose,
and placed in relation to each other” (Secrist). Elucidate the
above definition, bringing out clearly the characteristics of
statistics. (B. Com. Allahabad)

8. What are Statistical Methods ? Explain their scope and 
limitations. 

Critically examine the following definitions of Statistics :— 

(i) Statistics is the science of counting ; 
(ii) Statistics is the science of averages ; 
(iii) Statistics is the science of the measurement of social
organism regarded as a whole in all its manifestations.
(B. Com. Agra)

9. "Statistical Methods include all those devices of analysis 
and synthesis by means of which Statistics are scientifically 
collected and used to explain or describe phenomena, either in 
their individual or related capacities".—(Secrist). 

Elucidate the above statement. (B. Com. Nag.) 

10. “Science without statistics bears no fruit ; Statistics 
without science have no root". Explain the above statement with 
necessary comments. (M.A. Patna) 

11. “Statistics are numerical statements of facts in any 
department of enquiry, placed in relation to each other”.— 
(Bowley). Comment on this statement and explain the limitations 
of statistics in economic analysis. (M.A. Agra) 

12. “Statistics has been defined as the ‘Science of averages’.” 
Discuss the correctness of this definition. (M.A. B.H.U.) 

13. Discuss the meaning and scope of Statistics from the 
modern point of view and indicate its use in some of the important 
branches of Art and Science with which you are familiar. 
(B. Com. Bombay) 


CHAPTER 2 


FUNCTIONS, LIMITATIONS AND 
IMPORTANCE OF STATISTICS 


"The black-letter man may be the man of the present, but 
the man of the future is the man of statistics and the master of 
Economics." 

Justice Oliver Wendell Holmes 


Functions of a Statistician 


According to Rhodes, “The functions of a statistician may 
properly be considered as divisible into three parts. In the 
first place he is concerned with the assembling of statistical 
data, in the second place with their analysis, and in the third 
place with the interpretation of the results of such an analysis.” 
Thus there are three main functions of a statistician :— 


(1) He has to plan a statistical enquiry and determine 
its objects, scope etc. After all such preliminary 
work he will arrange for the collection of data. He 
will determine the sources from which data will be 
collected. After the collection of data, he will edit 
them and present them in tables. 

(2) Secondly, he will analyse the data and calculate 
various measures to show the significance of the 
data. 

(3) Thirdly, he will interpret the data; if necessary he 
may forecast certain facts and give his 
suggestions. In the words of Neiswanger, “statistical 
workers have the task of organising and 
summarising the data, of observing the variations 
revealed, of analysing the relations, of preparing 
reports, explaining them and making recommendations.” 

A statistician has to render services of a scientist as well 
as of an artist. He can perform his functions efficiently only 
when he is master of statistical techniques and laws. His work 
is not merely of a routine nature. He has to work under 
various handicaps and he should be very cautious and vigilant. 
He has to present the facts as they are, without any bias and 
manipulation. A statistician is not an alchemist expected to 
produce gold from any worthless material. He is like a chemist 
who analyses facts properly and gives the results of his 
examination. Neiswanger states that, “The duty of the 
statistician, therefore, goes much beyond collecting data and 
making calculations. Facts do not speak for themselves and it 
is the statistician who must interpret the statistical results 
to discover their meaning." 

Functions of Statistics. The science of statistics performs 
the following functions :— 

1—Condensation—The human mind is unable to remember 
huge masses of facts and figures. Statistical methods make these data 
easy to grasp. Figures are boring. A man is bound to be 
confused and lost in figures. Statistical techniques like averages, 
variation, graphs and diagrams etc. make these figures intelligible 
and understandable. If figures of exports and imports of a 
country for 20 years are given, a man may not understand the 
significance of its foreign trade, but on presenting those data 
graphically it will become very clear to him. With the help 
of statistical methods it is possible to understand the whole 
thing in a short time and in a better way. 
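
The condensation described above can be illustrated with a minimal Python sketch. The export and import figures below are hypothetical (arbitrary units, invented only for illustration), and the helper name `condense` is likewise an assumption; the point is merely that twenty figures reduce to a few summary measures that are easier to grasp.

```python
from statistics import mean

# Hypothetical export and import figures (arbitrary units) for 20 years.
exports = [100 + 5 * year for year in range(20)]
imports = [120 + 3 * year for year in range(20)]


def condense(series):
    """Reduce a long series of figures to a few summary measures."""
    return {
        "average": mean(series),
        "lowest": min(series),
        "highest": max(series),
    }


summary_exports = condense(exports)
summary_imports = condense(imports)

# Balance of trade for each year, and the first year showing a surplus.
balance = [e - m for e, m in zip(exports, imports)]
first_surplus_year = next(i for i, b in enumerate(balance) if b > 0)

print(summary_exports)
print(summary_imports)
print("First year (counting from 0) with a trade surplus:", first_surplus_year)
```

A graph of the two series would convey the same picture visually; the summary measures are only one of several possible devices.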

2—Comparison—Comparison in quantitative terms is 
easy. Boddington states that, “The object of statistics is to 
enable comparison to be made between past and present results 
with a view to ascertaining the reasons for changes which have 
taken place and the effect of such changes in the future.” 
Hence the chief function of statistics is comparison. Statistical 
devices like averages, ratios, percentages, rates, coefficients, 
standard error etc offer the best way of comparison between two 
phenomena. 

3—Studies Relationship—Another function performed by 
Statistics is to investigate the relationship between two or more 
phenomena. The relationship existing between demand and 
supply, money-supply and price level, rainfall and agricultural 
production can best be measured with the help of statistical 
methods. Statistics also helps in finding out the association 
between two or more attributes. 

4—Enlarges individual knowledge—Dr. A. L. Bowley 
states that, “The proper function of statistics, indeed, is to 
enlarge individual experience.” Knowledge becomes precise 
and easy to understand with the help of statistical techniques. 
Statistics is such a master-key that it solves problems of 
mankind in every field. Many fields of knowledge would have 
ever remained closed to mankind, but for the efficient and useful 
technique and methodology of the science of statistics. 

5—To formulate policies in different fields—Statistics helps 
in formulating policies in different fields, especially in social, 
economic and business fields. Various laws have also been 
developed on the basis of statistics; for example, Malthus’s theory of 
population and Engel's law of family expenditure are due to the 
statistics collected by these two noted economists. 

6—Measures the effects—Not only in formulating a policy 
but also in measuring the effects of a policy, statistics acts as a 
guide. The effects of a change in the Bank Rate or tax-rate 
etc. can be properly studied with the help of statistics. Without 
an adequate use of statistical data it would be impossible to 
arrive at any correct and dependable conclusion. 

7—Testing Hypotheses—Statistical methods are also 
employed to test hypotheses in theory and to discover newer 
theories. Statistical methods are so helpful in testing the 
correctness of theories that Marshall remarked, “Statistics are 
the straw out of which I, like every other economist, have to 
make bricks.” 

8—Forecasting—Statistical methods are not only helpful 
in estimating the present but also in forecasting the future. 
There are special techniques in Statistics for extrapolation and 
forecasting. Almost all our activities are based on estimates 
about future, and the science of statistics provides scientific base 
for such estimates. The science of statistics also throws light 
on the magnitude of any problem. Thus in the words of Robert 
W. Burgess, “The fundamental gospel of Statistics is to push back 
the domain of ignorance, prejudice, rule of thumb, arbitrary 
and premature decisions, traditions and dogmatism and to 
increase the domain in which decisions are made and principles 
are formulated on the basis of analysed quantitative facts.” 

Limitations of Statistics. Statistics, while an extremely 
useful science, has its limitations and shortcomings which 
cannot be overcome. These limitations should be kept in mind 
while using or interpreting statistics. These limitations are :— 

(1) Does not study qualitative data :—The science of 
statistics studies only the quantitative aspect of a problem. 
Those facts which are not capable of being quantitatively 
expressed, such as intelligence, poverty, honesty etc., cannot be 
studied unless these attributes are reduced into precise 
quantitative terms. 

(2) Deals with the averages :—According to W. I. King, 
“Statistics largely deals with averages and these averages may 
be made up of individual items radically different from each 
other.” Laws of statistics are true on an average. These laws 
are not universally applicable as are the laws of Physics or Chemistry. 
Statistics deals with phenomena which are affected by a 
multiplicity of causes, and it is not possible to segregate the 
effects of one factor as can be done in the physical sciences. 

(3) Does not study individuals :—Statistics does not study 
individual items. It deals with mass phenomena. This is a 
serious limitation of statistics. If only 100 people die of 
starvation in India, the percentage of these deaths will be a 
very negligible figure ; but this does not in any way reduce the 
suffering of the families affected by these deaths. 
Similarly average wage of a factory worker may be high, but 
there may be certain workers who may be under-paid. Statistics 
fails to bring out such features. Statistics from the very nature 
of the subject cannot and never will be able to take into account 
individual cases. 
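
The point about the average wage can be made concrete with a small Python sketch. The wage figures are hypothetical, chosen only to show how a high average may conceal many low-paid individuals; nothing here is drawn from actual wage data.

```python
from statistics import mean, median

# Hypothetical monthly wages (in rupees) of ten workers in one factory.
wages = [60, 65, 70, 70, 75, 80, 85, 400, 450, 500]

average_wage = mean(wages)      # pulled upward by the three highest wages
median_wage = median(wages)     # the middle of the ordered wages

# Workers whose wage falls below the factory average.
below_average = [w for w in wages if w < average_wage]

print("Average wage:", average_wage)
print("Median wage:", median_wage)
print("Workers paid less than the average:", len(below_average), "of", len(wages))
```

Here the average alone would suggest a well-paid workforce, although seven of the ten workers earn less than half of it. The average describes the group, not any individual in it.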

(4) Liable to be misused :—There are chances of statistics 
being misused. Everyone cannot make proper use of the 
statistics. The handling of statistics requires special care and 
technique. 

(5) False conclusions might be derived :—If statistics are 
quoted without their context, it may lead to false conclusions. 

(6) This is one method :—Statistical methods furnish only 
one method of studying a problem. There are other methods 
also. These methods should be used to supplement the conclu- 
sions arrived at by the help of statistics. 

Bowley has stated, “Statistics only furnish a tool, necessary 
though imperfect, which is dangerous in the hands of those who 
do not know its use and deficiency.” Statistical technique is no 
doubt essential for processing, analysing and interpreting 
numerical data. But the methods are by no means perfect and 
it is very dangerous to jump to conclusions without a clear-cut 
idea about its limitations. 


Distrust of Statistics. Statistical methodology has yielded 
and is yielding important results in many fields of knowledge. 
The science of statistics is very useful indeed, but its usefulness lies 
not in the figures themselves, but in the correct interpretation 
of those figures. Due to ignorance and bias people misuse this 
delicate tool of knowledge, and this has created distrust of the 
science of statistics. The science of statistics is looked upon 
with a suspicious eye and condemned as a tissue of falsehood. 
Stephen Leacock wrote that, “In earlier times they (people) had 
no statistics and so they had to fall back on lies. Hence the 
huge exaggerations of primitive literature—giants or miracles 
or wonders ! They did it with lies and we do it with statistics, 
but it is all the same.” He treats lies and statistics as sisters. 
Disraeli, a former Prime Minister of England, remarked that 
“there are three degrees of comparison in lying: lies, damned 
lies and statistics.” A famous writer has said that ‘History 
asserts without evidence while Statistics asserts contrary to 
the evidence.’ It is also said that ‘A statistician is a person 
who draws a mathematically precise line from an unwarranted 
assumption to a foregone conclusion.’ 


Karsten in his book on ‘Graphs and Charts’ writes, “The 
statistician is sometimes looked upon as one whose acquaintance 
with figures is so very intimate that he can readily take liberties 
with them, abuse them, present them in a false light and deceive 
the layman. In this view he is little more than a common 
trickster performing legerdemain with numbers, his magical 
results to be idly wondered at but not to be trusted.” People 
say that an ounce of truth will produce tons of statistics, or 
that statistics are lies of the first order. It has been remarked 
that ‘there are black lies, white lies, multichromatic lies ; 
Statistics is a rainbow of lies.’ People believe that ‘Statistics 
can prove anything’. 

These statements indicate the extent to which the science 
of statistics has fallen into disrepute. On the other hand it is also 
said that ‘if figures say so it cannot be otherwise’ or ‘figures 
don't lie’. 

The reason for this diversity of opinions lies in the 
innocence of figures. Figures by their nature are innocent and 
easily believable. It is human psychology that facts supported 
by statistics are easily believed. As data convey a sense of 
precision and accuracy, it is natural to have faith in them. 
Misuses unfortunately are probably as common as valid uses of 
statistics. The ability to discriminate between a valid and an 
invalid use of statistics is more important for most people than 
knowing how to make effective use of statistics themselves. 
Figures supporting a fact may not be true. They may be 
incomplete, inaccurate or deliberately manipulated by 
prejudiced persons. Figures do not bear a trade mark of their 
accuracy, as stated by W. I. King: “One of the shortcomings of 
statistics is that they do not always bear on their face the label 
of their quality.” 

Fault lies not with the science of statistics but with the 
users who misuse it. “Statistics are like clay of which you can 
make God or Devil as you please.” Before we believe in figures, 
we should examine whether the figures are unbiased, properly 
collected and scientifically analysed. Statistics does not prove 
or disprove a thing. It is merely a tool. It is only a method 
of approach. It is a tool in the hands of a statistician to present 
facts in a precise manner. Wrong conclusions are 
presented when some persons lean on statistics like a drunk 
person on a lamp post, for support rather than for illumination. It 
often happens that a person who is interested in obtaining 
a particular result from his investigation will be unable to 
avoid mistakes in his reasoning, and this will permit him to 
obtain the desired result. Hence “figures don't lie, but liars 
figure.” Statistics is only a method of investigation just as 
medicine is a method of curing disease. Only a witch doctor 
or a quack makes extravagant claims about his methods ; 
similarly an inexpert statistician will try to prove wrong things 
with right statistics. “The statistician is not an alchemist 
expected to produce gold from any worthless material.” Different 
approaches give different conclusions from the same set of 
figures. Hence there is need of caution. 

Statistical methods are delicate tools likely to be misused. 
Hence fault lies with the user and not with the science of 
statistics. Just as a knife may be used in cutting one's throat 
instead of cutting a sweet red apple, similarly statistics may be 
wrongly used. These innocent things can be used in any way 
we like. Thus it is desirable that only experts and trained 
persons should use these delicate tools. Bowley says, 
“Statistics only furnish a tool, necessary though imperfect, which 
is dangerous in the hands of those who do not know its uses 
and deficiencies.” Neither are statistics lies nor is the science of 
statistics a science of liars. Marshall states that “Statistical 
arguments are often misleading at first, but free discussion 
clears away statistical fallacies.” Bowley also says that, 
“Statistical methods are most dangerous tools in the hands of the 
inexpert. Statistics is one of those sciences whose adepts must 
exercise the self-restraint of an artist.” Mill is also of the 
opinion that “As a tool statistical method requires intelligent 
usage and that the results secured through statistical analysis 
require intelligent interpretations.” Thus in the words of 
W. I. King, “The science of statistics is a most useful servant, 
but only of great value to those who understand its proper 
use.” He who accepts statistics indiscriminately will often be 
duped unnecessarily. But he who distrusts statistics 
indiscriminately will often be ignorant unnecessarily. 

The distrust is, however, slowly decreasing, because of the 
increasing interest evinced in the study of statistics and in the 
recognition of its limitations as well as improvement in 
statistical methods. Now no one whether he is an administrator 
or a scientist or a responsible citizen can afford to be misled by 
bad statistics and everybody needs knowledge that can be gained 
only through the effective use of statistics. Reliable statistics 
are the lamps that light our path on the road of knowledge. 

Sources of Errors in Generalisation. Unfortunately errors 
of methods and interpretation of quantitative data are 
common, because statistical methods of analysis and the nature 
of statistical results are unknown to many who attempt to 
analyse and interpret quantitative data. The causes of 
errors are :— 

1—Inappropriate Comparisons :—Conclusions drawn from 
non-homogeneous data present a wrong picture. Suppose the health 
conditions of a city population are compared, disease by disease, 
with those of the military, and it is concluded that military people 
are healthier than city people. This conclusion is wrong because 
the populations compared do not stand on the same footing. The 
military consists of young and healthy persons, while in a city 
persons of all types and of different ages live. Living conditions 
too differ very much. This leads to a wrong conclusion. 

2—[…] terms, on the basis of which statistics have 
been collected, are carelessly used. If statistics of the unemployed 
are collected in two different periods, before making a comparison 
of such statistics it should be examined whether the definition 
of ‘unemployed’ remained the same for both the periods or not. 
If there has been any change in the concept of ‘unemployed’, 
wrong conclusions will be drawn. 

3—Inaccurate Measurements :—Sometimes wrong conclu- 
sions are drawn due to inaccurate information being collected. 
If Government collects statistics at one time for tax purposes 
and at another time for giving some benefit, results will be 
quite different. 

4—Non-representative data :—Wrong conclusions are 
drawn about an entire population from inadequate data, or from a 
part which is not representative of the whole. If in the U.S.A. 
a newspaper conducts a survey about the popularity of Mr. 
Kennedy and Mr. Khrushchev, it is likely that more votes will 
come for Mr. Kennedy, as the circulation of the paper will be 
limited to the U.S.A. Hence it will be wrong to say that 
Mr. Kennedy is more popular than Mr. Khrushchev. 

5—Inappropriate association or Correlation :—Wrong 
conclusions are drawn due to inappropriate association or 
correlation. 

6—Technical Errors :—Technical errors also lead to 
erroneous belief. Such errors may be due to disregard of 
dispersion, misleading charts etc. 

7—Misleading statements or Fallacious induction :— 
Misleading statements based on faulty logic result in wrong 
conclusions. Suppose it is stated that ‘one third of the women 
students at Delhi University during last year married professors’, 
while there were only three women students in that University 
and only one of them married a Professor. It will be wrong 
to infer that generally women students marry their Professors. 

8—Fallacious Deduction due to Overlooking the details :— 
An enquiry may reveal that the matinee show of the Cinema is popular 
with the students, but there may be certain facts behind this 
conclusion. It may be due to the fact that students, for 
fear of their guardians, go to see the matinee show. 

9—Errors in the use of Percentages :—Percentages are 
used to show the changes in an aggregate when the relative 
change is also important. Suppose it is stated that production has 
declined in a factory during a year by 150%. This is wrong, as 
production declining by 100% will reach zero. The facts 
were that production fell from 87,500 units to 35,000. This is 
a decline of 52,500 units, which is 150% of 35,000. But this 
base is wrong: the decline should be reckoned on the original 
figure of 87,500 units, of which it is 60%. 
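
The arithmetic of this example can be checked with a short Python sketch. The function name `percentage_change` is an illustrative assumption, not a standard library routine; the figures are those given in the text.

```python
def percentage_change(old, new):
    """Change from `old` to `new`, expressed as a percentage of `old` (the base)."""
    if old == 0:
        raise ValueError("the base figure must be non-zero")
    return (new - old) * 100 / old


old_output, new_output = 87_500, 35_000

# Correct: the decline is measured on the original output.
correct = percentage_change(old_output, new_output)

# Fallacy: the same decline measured on the later (smaller) output.
wrong = (new_output - old_output) * 100 / new_output

print("Decline on the correct base:", correct, "per cent")
print("Decline on the wrong base:", wrong, "per cent")
```

Measured on the proper base the decline is 60 per cent; a decline can never exceed 100 per cent, which is why the figure of 150 per cent betrays the wrong base at once.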

10—Bias :—If a statistician is bent upon drawing a 
conclusion he will distort the facts to prove pre-determined 
conclusions. 

A careful student of Statistics must guard against these 
pitfalls and should draw conclusions which are warranted by 
the nature of the data. 


Importance of Statistics 


"If they don't count, they won't count" thus said Anatole 
France for the Chinese people. This statement amply bears 
out the importance of statistics in the modern world. Even 
those persons who have no knowledge of statistics employ 
statistical methods in day-to-day decisions. When a person 
wishes to purchase a radio or a machine he studies the price 
lists of various companies for making a choice. What he really 
aims at is to have an idea about the range within which the 
prices vary. When a farmer wishes to have a particular 
quantity of rains in a particular season for good harvest, he 
has an idea of the correlation that exists between rainfall and 
crop production. Thus all persons employ statistical methods 
knowingly or unknowingly. 


Importance of Statistics in various fields 


In Economics :—Statistical data are a powerful aid in 
economic analysis. Prof. A. Marshall, the renowned economist, 
observed that, “Statistics are the straw out of which I, like 
every other economist, have to make bricks.” The study of every 
economic problem requires the use of statistical methods. 
Statistical methods are the tools and appliances of an economist’s 
laboratory. Economists have been examining economic behaviour 
in new relations and contexts in an effort to throw added light 
on the complexities of modern economic organisation and to test 
hypotheses derived from theory. Statistical data and methods 
of statistical analysis render valuable assistance in the proper 
understanding of economic problems and the formulation 
of economic policy. Economic problems almost always involve 
facts that are capable of being expressed numerically. In every 
branch of Economics, statistics is indispensable. Statistics of 
consumption enable us to find out the way in which people 
of the different strata of society spend their incomes. Such 
statistics are very helpful for knowing the standard of living 
and the taxable capacity of the people. The Law of Demand 
and the Elasticity of demand are based on inductive reasoning. 
Statistics of production help us to adjust the supply 
according to demand. Such statistics are the measures of 
productivity of a country. Productivity of different factors of 
production can be gauged and compared. We can measure our 
progress year after year. Census of production has become an 
outstanding feature of every nation. Such statistics are a 
store house of valuable information. In the field of exchange 
we study markets, laws of prices based on supply and demand, 
cost of production etc. A systematic study of all these facts 
cannot be made without statistics. What price should a 
monopolist charge in order to reap the maximum profit ? What 
shall be the price of a particular commodity if its supply is 
increased or decreased ? All such questions can best be answered 
with the help of statistics. Statistics are the very foundation 
stone of the theory of exchange. In distribution too statistics 
play a vital role. With the help of statistics the national wealth 
of the country is estimated and its distribution among the 
people of the country is found out. Equitable distribution of 
national income is another serious problem best solved 
statistically. Thus we find that there is a kinship between 
Economics and Statistics. Since the publication of the General 
Theory of Employment, Interest and Money by John Maynard 
Keynes in 1936 and the preparation of the national income 
accounts, there has been a sharp revival of interest in attempts 
to give numerical magnitude to certain critical problems referred 
to in theory, such as the investment multiplier, to test by 
statistical means some of the hypotheses in theory, and by the 
use of numerical data to discover, in the context of the newer 
theory, relations in the economy not previously examined. 


There has been a new trend of bringing together economic 
theory, mathematics and applied statistics to solve various 
economic problems of the day. Those who work with economic 
theory restate theoretical propositions until they are suitable 
for statistical testing. Another recent development in economic 
statistics is input and output analysis. Since the last decade of 
the last century two important factors have brought about a 
fundamental change in the place of Statistics in Economics. 
The development of statistical methods (of probability, sampling 
and curve-fitting, correlation etc.) closely coincided with the 
enlargement of numerical data, made possible by the 
establishment of statistical bureaus and scientific recording of population 
censuses in different countries of the world. The improvements 
in statistical materials about the close of the nineteenth century 
mark the real inception of statistics in economics. It will not 
be inappropriate to name Statistics the ‘arithmetic of human 
welfare’ today. 

A statistical approach to an economic problem not only 
leads to its correct description but also indicates lines along 
which it is to be tackled. Statistics is also being used not only 
to develop new economic concepts but also to test the old ones. 
Hence in all types of economic problems a statistical approach is 
essential and statistical analysis is useful. To quote Dr. Bowley, 
“No student of political economy can pretend to complete 
equipment unless he is master of the methods of statistics, 
knows its difficulties, can see where accurate figures are possible, 
can criticise the statistical evidence and has an almost 
instinctive perception of the reliance that he may place on the 
estimates given to him.” 

Statistics in Economic Planning. Whenever we think of 
an economic plan for a country, we have to think of statistics. 
Planning cannot be imagined without statistics. Statistics is 
the base upon which the structure of planning rests. Economic 
planning is now regarded as essential for the proper and 
systematic development of a country. Economic planning has 
assumed a special importance in the under-developed countries. 
Economic planning aims at proper exploitation of the national 
resources, both men and material, so as to raise the standard of 
living of the people. Before framing a plan we have to know 
what is our present production capacity ? What are our 
requirements ? What are the resources that can be exploited ? 
What is the trend of our population ? These questions 
cannot be answered without proper statistics. If planning is 
adopted for solving some special problem, then too we have 
to know the extent of the problem. Without statistics 
economic planning will be a planning in the dark. Statistics are 
required not only for framing the plan but also for measuring 
the achievements of the plan. In India, while planning for the 
economic development of the country, plan framers have made 
use of the statistical material available in the country. Lack 
and inaccuracy of statistical data are responsible for many 
drawbacks and inaccuracies in our plans. Hence, we see that 
planning without proper and adequate statistics is inconceivable. 
An efficient statistical organisation and adequate and correct 
statistical information relating to all the aspects of the life of 
a nation are essential pre-requisites to planning. On the twin 
pillars of facts and figures will rest the national policy of every 
country. The correct interpretation of facts and figures becomes 
as essential as the careful collection of the relevant data. 
Thus in the formulation of economic planning statistical 
organisation assumes utmost importance. 

Statistics in Government Administration. Statistics are 
the eyes of government administration. Governments have 
long collected and interpreted data concerning the state. In fact 
the word ‘Statistics’ is originally derived from State. Statistics 
are by-products of government administration. Certain statis- 
tics like that of crimes, taxes, wealth, trade etc. are collected 
in administration automatically. 

Since the conception of a welfare state and increase in the 
duties and functions of a state, importance of statistics has 
increased in government administration. 

In 1954, Arthur F. Burns, Chairman of the U.S. Council 
of Economic Advisers to President Eisenhower, said, “What 
we debate now-a-days is not the need for controlling business 
cycles but rather the nature of governmental action, its timing 
and its extent”, and this is made possible only by data-collecting 
activity and the development of statistical analysis in 
economics. Full employment policy is regarded as an essential 
function of the government. Government has to adopt all 
practical means to promote maximum employment and 
production. This can be done by adjusting its expenditure, fiscal and 
monetary policies. But adjustment can be made only on the 
basis of some statistical facts. The success of attempts to 
increase the economic security of the people by such policies 
depends to a considerable extent on statistical intelligence and 
forecasts of economic behaviour. 

Statistics in Commerce and Business Administration. 
Statistical methods have been increasingly used in business. 
One element common to all problems faced by business managers 
is the need to make decisions in the face of uncertainty. It is 
not surprising that statistical methods are widely applied 
by business managers in all types of managerial decisions. 
Statistical methods are applied in market and product research, 
investment policies, quality control of manufactured products, 
selection of personnel, economic forecasting, auditing and many 
others. According to Prof. Boddington, “In order to succeed 
in any business today, the businessman must study all the 
factors which enter into production, buying and selling, 
exporting and importing of goods in which he deals." 

Business is regarded as a profession of forecasting which 
requires a high degree of skill. Success in business depends 
upon precision in forecasting. A businessman must make a 
proper and scientific analysis of the past records in order to 
forecast the future business conditions. Business barometers 
are constructed with the help of existing records to tell us the 
future course of business events. 

The aim of doing a business is to earn profit. Profits can 
be increased either by increasing turnover, which in these 
days of keen competition requires special technique or by 
reducing expenses. The latter course is easier to follow. 
Scientific management has gained ground in business enterprises. 
It aims at maximum production with minimum of waste. For 
scientific management a careful analysis of the past records 
is required. Time, motion and fatigue studies are based on 
Statistical techniques. 

Every businessman irrespective of his nature of business 
has to employ statistical techniques in estimating trend of 
prices, trend of economic activities etc. Statistics are equally 
important to the stock and share brokers, speculators and 
investors. They have to study and compare the prevailing rates 
at different places. Insurance business too would not have 
developed but for the development of statistics. The theory of 
probability works itself out fully in the field of insurance. 

In the management of a business enterprise, statistical 
methods serve a myriad of purposes. At any one period, there 
is a management programme in the form of production, purchase,
inventory, capital, personnel and sales. Such programmes are 
framed on the basis of statistics. 

Many business undertakings now have a statistical depart- 
ment functioning independently or combined with the production 
or accounts department. That department compiles various 
statistics to be used by the management for determining issues 
connected with the business. Businessmen are gradually
learning to give statistics the place it deserves. Quality
control, budgetary control, cost control etc. are gaining ground in
the business world.

Statistics, no doubt, is a useful technique in the hands of
business administrators. It is to be supplemented by practical 
knowledge and experience so as to yield good results. 


Theoretical Questions 
1. What are the important duties of a statistician? Under 
what conditions would he be successful in his mission ? 
(M. Com. Raj.) 
2. The application of statistical methods to investigations
is generally based on assumptions, it is subject to limitations and
often leads to uncertain inferences. Comment. (M.A., Agra)
3. Write an essay on the Fundamental concepts of Statistical
Science. (M. Com. Vikram)
4. “Statistics only furnish a tool, necessary though imperfect,
which is dangerous in the hands of those who do not know its
uses and deficiencies.” (Bowley). Discuss the above statement
and explain the importance of statistics. (B. Com. Agra)
5. Define ‘Statistics’ and show how it can help the extension
of scientific knowledge, the establishment of a sound business and
the formulation of a plan for national economic development.
(B. Com. Agra)
6. Discuss: “For some subjects statistics provides ideas of
basic importance; for some it provides methods of investigation.
In one way or the other, or in both ways, Statistics has an
important bearing on most other branches of knowledge.”
(M. Com. Agra)
7. ‘Statistics are like clay of which you can make a God or
Devil as you please.’ Discuss. (B. Com. Alld.)
8. ‘Statistics should not be used as a blind man does a
lamp-post, for support instead of for illumination.’ Comment on
the above remark. (B. Com. B. H. U.)

9. ‘Statistical methods are most dangerous tools in the hands
of the inexpert. Statistics is one of those sciences whose adepts
must exercise the self-restraint of an artist.’
Explain fully the significance of the above statement.
(B. Com. Alld.)

10. ‘A Statistician is not an alchemist expected to produce
gold from any worthless material.’ Comment on this statement.
(M.A. Punjab)

11. Write a short essay on the application of modern
Statistical technique to economic problems, illustrating your answer
with reference to at least three concrete examples.
(M.A. Agra)


12. Write an essay on ‘Statistics in the Service of State.’
13. Discuss the importance of the study of statistics and
show how it can help the extension of scientific knowledge, the
establishment of a sound business and the introduction of political
reforms. (B. Com. Agra)
14. What is statistics? Discuss fully the importance of
statistics in the management of a business enterprise.
(B. Com. B. H. U.)

15. Discuss fully the importance of Statistics as an aid
to Commerce. (B. Com. Alld.)
16. Discuss the importance of Statistics for National 
Planning in India. (M.A. Agra) 


17. (a) Reconcile the following statements :— 

(i) ‘With Statistics anything can be proved.’ 

(ii) ‘Figures do not lie.’ 

(b) Give the limitations of Statistical methods. 

18. Comment on the following statements :— 

(a) Statistics are not worth the cost and labour involved 
in their collection and maintenance in ordinary 
business. 

(b) Statistics should be handled only by experts.

(B. Com. Agra) 

19. ‘Statistics are the straw out of which I like every other 
economist have to make bricks.’ (Marshall) 

Explain in the light of the above observation the relation 
between Economics and Statistics and discuss how far it is correct 
to say that the science of economics is becoming statistical in its 
method. (M. Com. Alld.) 

20. Point out the chief limitations of statistics. 

21. Write a short note on ‘Importance of Statistics in
Economic Planning.’ How far is the available statistical material
in India adequate to form a reliable basis for the preparation of
a draft framework of the Third Five Year Plan?
(M. Com. Vikram)


CHAPTER 3


STATISTICAL ENQUIRIES 


"Statistical enquiries mean some sort of investigation by any 
agency whatsoever wherein relevant information is collected in 
numbers rather than in words." 


Statistics are collected incidentally or intentionally. A
considerable amount of statistics is available as a result of the
administrative functions of the government and is collected
only incidentally, e.g. statistics of crime, accidents, tax-payers,
imports, exports etc. These statistics are not collected primarily
for research purposes, although they may be very useful in the
field of research. On the other hand, some statistics are collected
for their own sake. These intentionally collected statistics are
the results of statistical enquiries. A statistical enquiry may
be a general purpose or a special purpose enquiry. A general purpose
enquiry attempts to obtain data which may be useful for many
purposes and does not necessarily try to obtain data to answer
specific problems. The best example of a general purpose
enquiry is a population census. A special purpose enquiry tries
to obtain information in a form suitable for analysing a specific
problem.

Generally by the term ‘enquiry’ we mean ‘a search for
knowledge’. Statistical enquiry therefore means a search for
knowledge conducted through statistical methods. Only that
knowledge can be acquired by a statistical investigation which
can be expressed numerically. A statistical enquiry has to pass
through the following stages:

(a) Planning the Statistical Enquiry, 

(b) Collection of Data 

(c) Editing and Presentation of the data 

(d) Analysis 

(e) Interpretation and Preparation of report. 

Planning the Statistical Enquiry. Before planning a
statistical enquiry, it is necessary to have some preliminary
analysis of the problem in question. The planner of the enquiry
should decide about the following points:

(1) Object and Scope of the statistical enquiry
(2) Sources of Information 

(3) Type of enquiry to be conducted 

(4) Statistical Units and their definition 

(5) Degree of accuracy desired. 

1—Object and Scope of the Statistical Enquiry. A statement
of objectives is of basic importance because it determines the 
data which are to be collected, the characteristics of the data 
which are relevant, the relations which are to be explored and 
the content and forms of the final report. The object of an 
enquiry may be either to test some hypotheses and verify some 
assumptions or to collect information on some problem. 

The scope of an enquiry is determined by its objectives. 
In practice, the difficulties of actually collecting data are often 
not as great as the difficulties which arise at the outset in 
defining the scope of the enquiry. The scope of an enquiry can 
be determined by reference to two questions—first the problem 
of deciding exactly what information the statistician wants and 
second the best way of getting it. 

2—Sources of Information. When the purpose and scope 
of an enquiry have been stated and agreed upon, the next 
thing is to determine the sources of data. The sources of the 
data may be :— 

(a) Internal and external. Data which come from the
internal records of an organisation and relate to the operation
of that individual organisation are referred to as internal data.
Statistics which are gathered from a number of organisations
or units are referred to as external data.

(b) Primary and secondary. A primary source of
data is one where the same authority that conducts the
enquiry gathers, analyses and publishes the data. A secondary
source of data is one where other authorities have
gathered the data and are responsible for them.
This distinction is one of degree only and not of kind. Primary
data become secondary data for other researchers.

3—Type of enquiry to be conducted. There are various
factors that determine the type of enquiry to be conducted.
One very important factor is the object and scope of the enquiry.
Whether a sample enquiry is to be conducted or a complete count
is to be taken depends upon the object of the enquiry. The nature
of the enquiry is also a determining factor in ascertaining the type
of enquiry. If it is desired to find out the normal yield, a sample
enquiry will give fairly accurate results; on the other hand, for
finding out the total area under cultivation, complete enumeration
should be undertaken. The answer to the question of who is
conducting the enquiry also affects the decision about the type
of enquiry. If the government is conducting the enquiry, it is easy to
get information from the people, as the government can
compel people to furnish the necessary information. Private
agencies and individuals can apply only moral force to
collect the required information. All these factors are to be
given due consideration before a decision is arrived at about the
type of enquiry to be conducted. The cost factor is also to be
considered in detail. It determines the course of the
enquiry and its thoroughness. The form of enquiry also depends
upon this factor.

There are various types of statistical enquiries :— 

(a) Census or Sample 

(b) Confidential or open 
(c) Direct or Indirect 

(d) Original or Repetitive
(e) Extensive or Limited. 

In a census enquiry every item of the population is surveyed.
A sample enquiry, on the other hand, surveys only a group of
items which are taken as representative
of the population.

A confidential enquiry is meant for some private purpose.
The results of such an enquiry are not made public. An
enquiry the results of which are published is called an open
enquiry.

A direct enquiry is one in which the data are capable of
quantitative expression, e.g. production of cloth. In an indirect
enquiry direct quantitative measurement is not possible, e.g.
knowledge, intelligence etc. Such qualitative information is
first converted into numbers and then passed through the
statistical machine.

An original enquiry is conducted for the first time, while
a continuous or repetitive enquiry is one where some enquiry
has already been conducted for that purpose; it is merely a
periodical repetition of such an enquiry.

An extensive enquiry is one where a number of aspects of a
problem are enquired into. A limited enquiry touches only one or
two aspects of a problem.
4—Statistical Units and their Definition. In order to
produce adequate and appropriate results, statistical units must
be clearly defined, though the definition of a statistical unit depends
upon the purpose of the enquiry. A statistical unit is that thing
in terms of which the investigator will count or measure.
Statistical enquiries may pertain to ‘Income’, ‘Accident’,
‘Employment’, ‘Wage’ etc. These are called statistical units.
These words carry wide meanings; for example, employment may
be permanent or temporary, total or partial etc. Similarly
‘accident’ connotes different ideas to different persons. To the
industrial inspector, it may mean one thing. The insurance
company may regard an event as an accident only when a compensation
claim arises, while to a doctor it may be an accident when his services
are required. The definition of statistical units should have
the following essential requisites.

(1) It must be very clear, simple and self-explanatory.

(2) It must be definite, specific and ascertainable.

(3) It must ensure homogeneity and uniformity.

(4) It must be stable and standardised. If the meaning
of a statistical unit is not stable, the data will lose their
comparable value.

(5) It must be suitable to the subject of the enquiry being
conducted.

There are two kinds of statistical units as shown in the
following chart.

Statistical Units
  Units of Measurement
    Unit of enumeration
      Simple
      Composite
    Unit of Analysis
      Ratio, Coefficient etc.
  Units of Presentation
    Time: (1) Year (2) Month (3) Day
    Space: (1) Nation (2) State (3) City etc.
    Condition: (1) Qualitative (2) Quantitative


Units of enumeration. These are the units in which data
are measured and collected. Units of enumeration may be
either (1) simple or (2) composite. A simple unit is one
which prescribes a single determining characteristic, e.g. a ton,
a mile etc. Such units are easily defined. A composite unit, on
the other hand, is one which is based upon more than one
determining characteristic, such as a ton-mile, a passenger-mile
etc.
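
A composite unit can be illustrated with a short calculation. The
sketch below, in Python, takes a ton-mile to mean one ton carried over
one mile, so that the ton-miles of a consignment are its weight
multiplied by the distance hauled. The consignment figures and the
function name are hypothetical and serve only to show how two simple
units combine into one composite unit.

```python
# Hypothetical consignments: (weight in tons, distance in miles).
consignments = [(10, 50), (4, 120), (25, 8)]


def ton_miles(weight_tons, distance_miles):
    """One ton carried over one mile counts as one ton-mile."""
    return weight_tons * distance_miles


# Total traffic expressed in the composite unit.
total = sum(ton_miles(w, d) for w, d in consignments)
print(total)  # 10*50 + 4*120 + 25*8 = 1180 ton-miles
```

Neither the tons alone (39) nor the miles alone (178) would describe
the volume of traffic; only the composite unit does.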
5—Degree of Accuracy. Before starting the actual work
of a statistical enquiry, the level of accuracy must be determined,
keeping in mind the time and money available for the purpose.
It should be kept in mind that absolute accuracy is impossible
to achieve. Therefore a reasonable standard of accuracy should
be aimed at. Perfect accuracy is unattainable even in a complete
count. Efforts should be made to maintain the desired degree of
accuracy throughout the enquiry.

Collection of Data. After these preliminaries are given
due consideration, there comes the task of collection of data.
The method of collection of data depends upon the nature,
object and scope of the enquiry on the one hand and the availability
of money and time on the other. Statistical data may be either
primary or secondary. Primary data are collected by the
investigator himself for the purpose of the enquiry. Secondary
data are collected by other agencies for their own use, but such
data may also be used by others. Primary data, once collected
and published, become secondary data for other investigators.
Primary data are collected for the enquiry itself, hence such
data are most suitable for the purpose.

Collection of Primary Data. In collecting primary data
the following methods may be used:

(a) Direct Personal Investigation,

(b) Indirect Oral Investigation,

(c) Information through correspondents, 

(d) Schedules to be filled in by the informants, 

(e) Schedules to be filled in by the investigators. 

a—Direct Personal Investigation. Under this method the
investigator interviews personally every one who is in a position
to supply the information he requires. The investigator
establishes personal contact with the informants and conducts
an on-the-spot enquiry. In olden days Le Play collected information
about family budgets of labourers in Europe in this way. Professor Zweig
also collected data after personally interviewing 400 people for
his book ‘Labour, Life and Poverty’, published in 1948. This
method possesses certain advantages, viz. (1) original data
are collected, (2) correct and required information
is gathered, (3) as the data are collected by one person, there
is uniformity in the collection of data.

But there are certain disadvantages of this method also.
They are:

(1) Such a method has limited value. It can
be used in very few cases because most statistical enquiries
cover a wider field than any single investigator could possibly
examine personally within any reasonable time.

(2) This method is very costly and requires more time.
(3) Personal bias may vitiate the results.

b—Indirect Oral Investigations. This method is used in
cases where informants are reluctant to give information or the data
are of such a complicated nature that it is difficult to get them
directly. Enquiry committees and commissions mostly use this
method. Persons who are supposed to possess information on
the problem under investigation are invited to give evidence.
Such persons are known as witnesses and their answers are
recorded. In this method only those persons should be interviewed
who (1) possess full facts of the problem, (2) are not
prejudiced, (3) are capable of expressing themselves correctly
and (4) are not motivated to give colour to the facts. The
success of this method depends upon the personal qualities of the
interviewers (their tact, courage and intellectual curiosity) and
the extent to which they understand the psychological and
instinctive reactions of the persons interviewed. Great care
and vigilance are needed in assessing the correct value of such
information. Such data should not be taken at their face value.
Due allowance must be made for the conscious and unconscious
bias of the person giving information.

c—Information Through Correspondents. Generally this 
method is not used in the collection of statistical material. 
This method can be used where the field of enquiry is very
limited and the degree of accuracy desired is not a very important
factor. Under this method local correspondents are instructed
to send their own views about a particular problem. This
method is very cheap and yields results easily and promptly.

d—Schedules to be filled in by the informants. Under
this method a schedule or questionnaire is prepared. The
questionnaire contains a set of questions on the problem
under investigation. These questionnaires are addressed to
individual informants and are sent by post. This method is
the least expensive; a vast area can be covered and information
can be gathered in a comparatively short period of time.
Ilersic states that, “This method, at one time extensively
employed, possesses the apparent advantage that a very large
field of enquiry may be covered at relatively low cost, and the
larger the coverage the less significant will be occasional errors
in the filling up of individual forms.” Under this method
original data are collected. But this method of collecting
information is not very satisfactory owing to the low proportion
of returns. Generally most of the informants do not return
the questionnaire. Moreover the returned questionnaires are not
carefully filled in, and haphazard answers are given. There is
also a possibility of misunderstanding the meaning of a question.
Only by compulsion or persuasion can information be gathered.
The success of this method depends upon tactful preparation of the
questionnaire. It is better to disclose the purpose of the enquiry and
the identity of the investigator to the informants. Informants
should also be assured that the outcome of the investigation will
in no way harm them, and that the information given will be kept
strictly confidential. This method is used very much by governmental
agencies. Informants are compelled to give information in a
particular form regularly to the agency. Joint Stock Companies
regularly furnish information regarding their activities to the
government and this information is later published.
e—Schedules to be filled in by the Investigators. The
task of filling in the questionnaire may be delegated to selected
investigators. Investigators or enumerators are provided
with a standardised questionnaire and explicit instructions as
to the mode of its completion and the information to be elicited.
The main problems in this case are the selection of suitable
enumerators and the cost involved. They should be tactful,
honest and painstaking. They have to tackle persons of different
natures; therefore they should adapt their ways of collecting
information to the nature of the informants. They should also be
fully conversant with the purpose of the enquiry, because ignorance
of the problem will seriously affect the value of the results
obtained. This is the most common method, being
employed by all research organisations. Information received
under this method is highly reliable.

Drafting of a Questionnaire. The task of eliciting
information from human populations in such form and with
sufficient exactness to be useful in scientific analysis is a most
difficult problem. People have their own whims and feelings of
pride, desire and prejudice. The success of a statistical investigation
depends upon tactful drafting of its questionnaire. The
person framing the questionnaire needs a detailed knowledge
of the field of enquiry. Before attempting to draw up the
questionnaire it is desirable to set out in detail the ideal data
which we desire from the answers to the questionnaire. It
might be wise to go even a step further and actually construct
the sorts of tables which we should like to emerge from our
enquiry. It is also desirable to try the questionnaire in a
pilot study before it is put to any large scale use. Since
the value of the results obtained from the enquiry depends
largely on the adequacy of the questionnaire, the following points
should be borne in mind during its preparation.

1. People do not enjoy form-filling or answering questions.
Therefore the questionnaire should be as short as possible.

2. The questions should be clear, unambiguous and 
precise. They should be capable of being answered in only a 
limited number of ways. Complicated and long-winded questions 
irritate the informants and result in careless replies. 

3. Questions should be such as can be answered as
far as possible by either ‘Yes’ or ‘No’ or by a name or figure.
Long answers should not be expected. Answers such as
‘Probably’, ‘Fairly good’, ‘Average’ etc. mean nothing to a
statistician as they signify different degrees to different persons.
Such answers should not be invited.

4. Questions should be framed with the right words. The
right word in the right place will ensure the validity of answers.
Words used should be such that their meanings are clear to all
informants. Words prevalent in the region of enquiry should
be used. In the U.S.A. there are several lists which are
available for reference in choosing words in devising questionnaires;
common lists are Stanley L. Payne's list, the Dale list, Lorge
Count's list etc.

5. Questions should be capable of objective answers.
Avoid questions of opinion. For example, instead of asking
‘whether he is content with his present job’ it is better to ask
‘if he desires to change his job, and if so, to what sort of job’.
This helps in tabulation of data.

6. Questions affecting the pride and sentiments of the people
should not be asked. Due regard should be paid to their
religious and political beliefs. Questions about private
affairs should not be asked. Unduly inquisitive or offending
questions should be avoided.

7. Certain types of questions should be avoided, because
they will not be answered correctly. Questions such as ‘Do you
beat your wife?’ or ‘Does your wife love you?’ will always be
answered ‘No’ and ‘Yes’ respectively. Open questions also
should not be asked. An open question is one which leaves the
door open to any answer.

8. The arrangement or sequence of the questions is also 
important. Questions should be arranged in some logical order. 
Start with the simplest question first. Order of the questions 
should be such as to facilitate the answering of each question 
in turn. Questions should involve a logical flow of thought 
in the mind of the informant. Questions should not skip back
and forth from one topic to another.

9. Precise and definite instructions for filling in the
questionnaire should be given. It is better to include the
instructions in the body of the questionnaire than to make
them footnotes or put them all together on the other side of
the page. Instructions too should not be too lengthy.

10. Some care should be taken in the actual setting out
of the questionnaire. It should be made to look as attractive
as possible. Plenty of space should be given for answers. It
has become a practice to print probable answers in the
questionnaire itself. The informant has to score out irrelevant
answers.
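
The practical gain from questions answerable by ‘Yes’ or ‘No’
(point 3) and from printed probable answers (point 10) is that the
replies can be counted mechanically. The following is a minimal
sketch in Python, assuming a handful of invented replies to a
hypothetical closed question; the names `replies`, `ALLOWED` and
`tabulate` are illustrative only.

```python
from collections import Counter

# Invented replies to a hypothetical closed question:
# "Do you desire to change your job?"  A blank means no answer given.
replies = ["Yes", "No", "No", "Yes", "No", "", "No", "Yes"]

ALLOWED = {"Yes", "No"}


def tabulate(answers):
    """Count each allowed answer; anything else is put under 'No reply'."""
    counts = Counter(a if a in ALLOWED else "No reply" for a in answers)
    return dict(counts)


table = tabulate(replies)
for answer in ("Yes", "No", "No reply"):
    print(f"{answer:<9}{table.get(answer, 0):>3}")
```

An answer such as ‘Fairly good’ would fall outside the allowed set
and could not be placed in any column of the table, which is the
difficulty point 3 warns against.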


Collection of Secondary Data. Secondary data are those
which have already been collected and analysed by some other
agency. Here the problem of original collection of data does not
arise. Sources of secondary data are:
(1) Official publications of the central, state and local
governments.
(2) Reports of Committees and Commissions.
(3) Publications and Reports of trade associations and
Chambers of Commerce.
(4) Technical and financial journals like Commerce,
Economic Times, Financial Express etc.
(5) Research work done by scholars in universities.
(6) Official publications of different international
organisations like U.N.O., I.L.O., ECAFE etc.
(7) Market reviews and reports.
(8) Articles published by distinguished authorities on
the subject.
Such data are collected by the collecting agency for its
own purpose. Statistical analysis in Economics and Commerce
usually depends, in part at least, on data collected and compiled
by government, trade associations, the Central Bank etc. But the
authority behind the data is no guarantee of their reliability. Connor
states that, “Statistics, especially other people's statistics, are
full of pitfalls for the user.” That is why Dr. Bowley warns,
“It is never safe to take published statistics at their face value
without knowing their meaning and limitations, and it is always
necessary to criticise arguments that can be based on them.”

Sources of Error in Secondary Data. The following are
the chief sources of error in secondary data:

(1) Estimating Errors: The data may be estimates
rather than facts. Estimates may have been made for
various reasons. Such data will have estimating errors.

(2) Fictitious Statistics: Statistics may be fictitious,
devised merely to prove some point. In order to prove that
there has been much progress under the Five Year Plans or
that the price-level has not increased, a government
may compile and publish fictitious data.

(3) Errors due to the use of substitutes: There may be
errors due to the substitution of different items, which may not be
disclosed.

(4) Errors due to classification: Data will be classified
according to the purpose of enquiry. Classification of data will
be different for different purposes.

(5) Discontinuities in the series: Such errors are
common in the construction of index numbers. A particular
index number may be discontinued and be replaced by some
other.

(6) Self-interest bias: Such errors will always be present
in any data.

(7) Definitional Errors: The definition of a statistical unit
will differ for different agencies. Before such data are used,
the definitions employed must be carefully examined.

(8) Double Counting: Double counting results in artificial
inflation of values. In certain cases double counting is
essential, as for example in input-output analysis.

(9) Reporting and Transcribing errors: Any large scale
reporting and transcribing of numbers will involve some errors.
In official statistics, agencies generally do not take proper care
to supply correct information.
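
The inflation caused by double counting, item (8) above, can be seen
in a small worked example. The sketch below, in Python, assumes a
hypothetical chain in which wheat is sold to a miller, flour to a
baker and bread to consumers; the rupee values are invented for
illustration and are not taken from any published series.

```python
# Hypothetical chain of sales (values in rupees): a farmer sells wheat
# to a miller, the miller sells flour to a baker, the baker sells bread.
sales = {"wheat": 100, "flour": 150, "bread": 220}

# Adding every sale counts the wheat three times and the flour twice.
gross_total = sum(sales.values())

# Value added at each stage = sale value minus cost of the input bought.
stages = list(sales.values())
value_added = [stages[0]] + [b - a for a, b in zip(stages, stages[1:])]
net_total = sum(value_added)

print(gross_total)  # 470
print(net_total)    # 220, equal to the value of the final product
```

A user of secondary data who does not know which of the two totals
an agency has published may overstate output by more than double in
this example.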

Before using the data collected and published by others,
one should approach them in a critical manner. He should
satisfy himself with regard to the following facts:

1. Whether the data are reliable: The reliability of data
can be tested by finding out the agency collecting such data.
If the agency is dependable and it has used proper methods in
collecting data, the statistics may be relied upon.

2. Whether the data are suitable for the purpose: Data
which are suitable for one purpose may be unsuitable for
another.

3. Whether the data are adequate: The data must be
adequate to the purpose. Incorrect conclusions will be drawn
from inadequate statistics.

The investigator should proceed cautiously in the use of 
secondary data. He should enquire into following points :— 

(a) The integrity and experience of the Collecting 
Organisation. 

(b) The scope and object of the enquiry for which they 
were originally collected. 

(c) The Type of Data: Census or Sample; if sample,
whether random, purposive or mixed.

(d) The method of collection adopted. 

(e) Measurement Units in which they were collected and 
expressed. 

(f) The extent of accuracy attained. 

(g) The extent to which they refer to homogeneous 
conditions and are, therefore, comparable. 

(h) The time and area covered by the enquiry. 

(i) The stability of the institution or body supplying 
the data. 

(j) The suitability or otherwise of the data for utilising 
for a specific purpose should be carefully examined. 

Representative Data. When it is necessary to collect
original data, the investigator has to decide whether
his enquiry will cover all the items of the population
or a group of representative items. In other words he
has to decide whether his enquiry will be a census
enquiry or a sample enquiry. In some cases a census is not
possible owing to limited time or money or both. Most
statistical enquiries are sample enquiries. The results
derived from sample enquiries are as good as the results of
full counts, provided proper care is taken in selecting the
sample. Weatherburn states that, “In order to examine a
large population with respect to a specified characteristic, the
statistician chooses a sample of individuals from that population,
and from the properties of the sample relating to the
given characteristic, he endeavours to estimate those of the
population... The theory of sampling is concerned first
with estimating the properties of the population from those
of the sample and secondly, with gauging the precision of the
estimates.” Sample enquiry is a scientific method of conducting
investigations. The basic idea behind a sample enquiry is that
the items selected will, on the whole, represent fairly and
correctly the characteristics of the population from which they
have been taken. Snedecor says, “A carload of coal is accepted
or rejected on the evidence gained from testing only a few
pounds. The physician makes inferences about a patient's
blood through examination of a single drop. Samples are
devices for learning about large masses by observing a few
individuals.”

The fundamental principles on which sample enquiries are
based are enunciated in the form of the following laws:

(a) Law of Statistical Regularity.

(b) Law of Inertia of large numbers.

(c) Law of persistence of small numbers.
(d) Law of decreasing variation.

(a) Law of Statistical Regularity: The law of statistical
regularity is a corollary of the theory of probability. According
to W. I. King, “The Law of Statistical Regularity formulated
in the mathematical theory of probability lays down that a
moderately large number of items chosen at random from a
very large group are almost sure on the average to have the
characteristics of the large group.” In other words, if a
moderately large number of items is selected at random from a
given population, the characteristics of these items will reflect,
to a fairly accurate degree, the characteristics of the entire
universe. Suppose it is desired to find out the average height of
the students of a university where there are 10,000 students. It
will be a difficult task to measure each and every student and
then find out the average height. If a group of, say, 100 students
is selected at random and its average height is calculated, the
result will be nearly the same as would be found by measuring
all the students. Most everyday judgements are based on this
assumption. Insurance companies, banks and other such commercial
concerns undertake business risks on the strength of this
principle. This law will hold good only when

(i) Items are selected at random from the universe.

(ii) The number of items is sufficiently large.

Selection at random means that every item has an equal
chance of being included in the sample. But it should not be
inferred from the above that a sample, no matter how large,
will give exactly the same results as would be obtained by the
use of the entire data. The probability of error diminishes
steadily as the number of items in the sample increases.
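The height example can be tried out as a small simulation. The sketch below is only illustrative: the population of 10,000 heights is generated artificially (a normal distribution with mean 165 cm and standard deviation 8 cm is an assumption made here, not a figure from the text), and a random sample of 100 students is compared with the whole population.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so that the run is repeatable

# Hypothetical population: heights (cm) of 10,000 students.
population = [rng.gauss(165, 8) for _ in range(10_000)]

# Random selection: every student has an equal chance of inclusion.
sample = rng.sample(population, 100)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population mean: {population_mean:.2f} cm")
print(f"Sample mean    : {sample_mean:.2f} cm")
print(f"Difference     : {sample_mean - population_mean:+.2f} cm")
```

The two averages should come out close but not identical, which is what the law claims: a random sample reflects the universe fairly accurately, not exactly.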

(b) Law of Inertia of Large Numbers: This law is a
corollary of the law of statistical regularity. It lays down that
‘large aggregates are more stable than small ones.’ If the
numbers involved are large, the total change is likely to be very
small. For example, while the amount of wheat produced in
any one country varies immensely from year to year, the
wheat production of the world as a whole remains relatively
stable for decades. The losses from fire in a single city may
vary in different years, but the annual loss throughout the country
will remain nearly constant. Prof. F. C. Mills clarifies this law in
these words: “While there is variation in nature the degree of such
variation is limited, there is some uniformity in natural
processes. When we are dealing with quantitative data this
uniformity in nature is found in the stability of large numbers,
as exemplified by the curious regularities in such phenomena
as birth rates or death rates. Nature in other words is not
marked by utter chaos; principles of regularity, order and
stability appear in all natural processes and these principles
are strongly evident when we deal with masses of quantitative
data.” Thus this law is based upon the nature of natural phenomena.
It states that uniformity is found in large aggregates.

(c) The Law of Persistence of Small Numbers: According
to this law, if it is found in a group of items that a small number
or proportion of items exhibit markedly different characteristics
from the remaining items, this tendency will persist even
though the number of items may be increased considerably. Any
number of unbiased samples from the same population will show
the same tendency of persistence. This cannot be eliminated. In
a college having two thousand students, however brilliant the
students it admits, there will be some who show very
poor results. If the number of admissions is doubled
with the same quality, the proportion of students showing
very poor results will still not vary markedly. This tendency is
known as the law of persistence of small numbers.




(d) The Law of Decreasing Variation: According to this
law, the variation between the characteristics of a sample and
the parameters of the population continues to diminish as
successive unbiased samples from the same group are combined,
that is, with each enlargement of the sample.
This law makes it possible to determine the proper size of
sample which would represent, to a sufficient degree of accuracy,
the characteristics of the universe.
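The law of decreasing variation can be illustrated numerically. The following is a minimal sketch, again using an artificial population (the normal distribution, the seed and the sample sizes are all assumptions chosen for illustration). For each sample size it draws many random samples and reports how far, on average, the sample mean falls from the population mean.

```python
import random
import statistics

rng = random.Random(7)
population = [rng.gauss(165, 8) for _ in range(10_000)]  # hypothetical heights (cm)
true_mean = statistics.mean(population)

def average_sampling_error(sample_size, repetitions=200):
    """Mean absolute gap between the sample mean and the population mean."""
    gaps = []
    for _ in range(repetitions):
        sample = rng.sample(population, sample_size)
        gaps.append(abs(statistics.mean(sample) - true_mean))
    return statistics.mean(gaps)

errors = {n: average_sampling_error(n) for n in (25, 100, 400, 1600)}
for n, err in errors.items():
    print(f"sample size {n:>4}: average error {err:.3f} cm")
```

The average error should shrink as the sample is enlarged. A table of this kind is one way of judging how large a sample is needed for a given degree of accuracy.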


Theoretical Questions 


1. Examine critically the important methods of collection
of statistical data. (B. Com., B.H.U.)

2. Discuss in brief the methods generally used in the
collection of primary data. (B. Com., Agra)

3. Classify the methods generally employed in the collection
of statistical data and state briefly their respective merits and
demerits. (B. Com., Agra)

4. What precautions should be taken in making use of
published statistics for further investigation? (B. Com., Agra)

5. “In collection of statistical data, commonsense is the
chief requisite and experience the chief teacher.” Discuss this
statement with comments. (M.A. Patna)

6. What is a ‘Statistical Investigation’? Describe the
preliminary steps you would take in planning a statistical
investigation. (B. Com., B.H.U.)

7. Describe the various stages in conducting a primary
economic investigation. What precautions will you take at each
stage? (M.A. Punjab)

8. Discuss the main steps necessary to conduct a family
budget enquiry in an industrial town. (M.A. Agra)

9. How would you conduct an enquiry about ‘Payment of
Wages in an industry’? On what points would it be necessary
for you to be clear before actually beginning investigation work?
(M. Com., Agra)

10. How would you organise a marketing survey of the
fruit trade in a particular region with a view to making suggestions
for its development? Explain the procedure you would follow
step by step. (M. Com., Agra)

11. What is a Questionnaire? What precautions should be
taken in drafting a questionnaire?

12. What is a Statistical Unit? Is it necessary that the
data be homogeneous? (B. Com., Agra)

13. Draw up a suitable questionnaire for surveying the
economic aspects of any cottage industry in which you may be
interested. Briefly indicate how you would proceed to collect
relevant materials. (B. Com., Luck.)

14. You have been appointed Secretary of a Committee to
conduct a statistical enquiry to measure the success or otherwise
of the Second Five Year Plan in M.P. How would you proceed?
Give details. (M. Com., Vikram)

15. State the Law of Statistical Regularity and explain how
it is useful in making investigations. (B. Com., Agra)

16. Distinguish between a Census and a sample enquiry
and briefly discuss their comparative advantages. Which of these
methods would you prefer for calculating the total wages of
workers in a given industry? (M. Com., Agra)


CHAPTER 4 


ACCURACY, APPROXIMATION 
AND ERRORS 


After collection of data, the next task to be performed is 
that of editing the data. Data must be scrutinized and the 
information given in a careless manner should be eliminated. 
Patient and critical reading of the returns is a time consuming 
but necessary phase of an investigation. It should also be
examined whether the data collected are appropriate or not.
Editing requires a high degree of skill. The accuracy of the
results of an enquiry depends upon the technique of editing 
data. The task of the investigator is to get data of the greatest 
possible accuracy. Editing results in approval, rejection, return 
or revision of records. It calls for marked ability, scrupulous 
care, sound judgement and utmost candour. In editing data the
following facts are given due consideration:

(1) Accuracy
(2) Approximation
and (3) Errors.

Accuracy

"Statistics is regarded as a science of estimates and
probabilities". In estimates perfect accuracy is unattainable.
We give importance to reasonable accuracy. It is wrong to think
that figures expressed in numerical terms are exact figures.
Figures of a profit and loss account or balance sheet should not
be compared with figures of some economic or social phenomena.
The latter are only approximations of the real value. In exact
sciences like physics or mathematics perfect accuracy may
be attained, but this is not possible in the less exact social
sciences. The reasons for not attaining perfect accuracy are
(1) imperfection of the investigator and (2) imperfection of the
instruments and tools of measurement. Man is not perfect; moreover
he has his own feelings of pride, bias etc. When he collects facts,
naturally his mind also works according to his feelings. Our acts
have a very high degree of correlation with our feelings of
heart and mind. Secondly, the statistical tools and instruments
are not as perfect as the tools of a scientist's laboratory.

Therefore we should not expect perfect accuracy in 
statistical investigations. It is neither possible nor required. 
In statistical investigations, there will be prejudices—may be 
deliberate or unconscious. Experimentation is not possible. 
Under such circumstances we can arrive at a reasonable 
standard of accuracy only. But nonetheless, the science of 
statistics helps us in understanding the factual world with all 
its inaccuracy and imperfections. We have to be content with
a reasonable degree of accuracy, which depends upon its practical
value in relation to its cost. A statistician may be satisfied with
a certain conventional accuracy.

Accuracy is a relative term. An iron dealer may weigh
iron correctly to, say, a kilogram, but so coarse a standard will
not do for a goldsmith. Similarly a cloth merchant may measure
cloth correctly to an inch or so, but such fineness is not possible
while measuring land. Therefore the accuracy required depends
upon the purpose and object of the enquiry. In statistics we need
only this type of accuracy.

In statistical investigations we aim not at mathematical
accuracy but at statistical accuracy. Mathematical
accuracy is not of much use in statistics. For example we may
find out the revenue derived by the government from Income
Tax to the nearest Naya Paisa, but such precision will add
nothing to a comparison of tax revenue over a
number of years. Therefore time, money and labour are saved
by remaining satisfied with reasonable accuracy. Perfect
accuracy is of little use in statistics. We should aim at
greater accuracy in further investigations. Every preliminary
investigation with its difficulties and weaknesses serves as a
basis for future work of a more accurate nature. Bowley says,
“In the present state of our knowledge, many statistical measure-
ments cannot be made with precision for want of data and a
critic is inclined to say that for this reason preliminary estimates
are valueless; but from the scientific point of view this criticism
is wrong, for a faulty measurement made on logical principle is
better than none, if limits can be assigned to its possible error
and may lead to others with progressive improvement.”
The characteristics of mass data are usually estimated by
statistical techniques and perfect accuracy is unnecessary,
unwanted and usually unattainable.


Approximation 


Most statistical data are approximate figures. The
process of approximation makes it simple to understand the
significance of data. It makes figures easy and clear to grasp
and facilitates calculation and comparison. If it is stated that
the Indian population is 43,83,35,735 it will be difficult to
remember this figure. On the other hand if it is said that
the Indian population is about 43.8 crores, the figure becomes
simple and will easily be remembered, and it loses nothing of its
importance. The extent of approximation depends upon the
degree of accuracy desired. When approximations are made,
the figures should be rounded in such a way as to indicate
the degree of precision of the facts.

The various methods of approximation are:

(i) By rounding the figures in 100 or 1,000 or 100,000. In
this case certain digits are dropped entirely, e.g.
(rounding to thousands)

    58,254 would become 58,000
    57,835    ,,    ,,   57,000
     9,235    ,,    ,,    9,000
     9,693    ,,    ,,    9,000

(ii) By raising the actual figure to the next higher whole
number (here, of thousands), e.g.

    58,254 would become 59,000
    57,835    ,,    ,,   58,000
     9,235    ,,    ,,   10,000
     9,693    ,,    ,,   10,000

(iii) By approximating to the nearest whole number. In
this method, the value is unchanged when the remainder to be
dropped is less than one half. It is raised to the next higher
digit if the remainder exceeds one half, e.g.

    58,254 would become 58,000
    57,835    ,,    ,,   58,000
     9,235    ,,    ,,    9,000
     9,693    ,,    ,,   10,000

Similarly percentages can also be rounded:

    15.82% may be called 15.8%
    14.69%   ,,   ,,     14.7%

When figures have been rounded, a note to this effect
should be given at the end of the data.
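The three methods of approximation can be written as short functions. This is a sketch for whole-number figures only; the function names are introduced here for illustration, and the treatment of a remainder of exactly one half (raised to the next unit) is an assumption, since the text covers only remainders below and above one half.

```python
def round_down(value, unit=1000):
    """Drop the digits below the rounding unit entirely."""
    return (value // unit) * unit

def round_up(value, unit=1000):
    """Raise the figure to the next higher multiple of the unit."""
    return -(-value // unit) * unit

def round_nearest(value, unit=1000):
    """Nearest multiple of the unit; an exact half is raised (assumption)."""
    return ((value + unit // 2) // unit) * unit

figures = [58254, 57835, 9235, 9693]
print("actual    dropped   raised    nearest")
for f in figures:
    print(f"{f:>6,}    {round_down(f):>6,}    {round_up(f):>6,}    {round_nearest(f):>6,}")
```

Dropping digits always understates and raising always overstates, whereas rounding to the nearest unit errs in both directions. This difference matters when errors are classed as biased or unbiased.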


Errors 


Statistical data are obtained either by measurement or by 
observation. Consequently it is never possible to ensure perfect 
accuracy. This is a limitation inherent in statistical studies. 
In measuring, say, the heights of students, nobody can guarantee
that there will be no error even to the extent of, say, 0.001". It
is also not possible to assure absolute accuracy while making
estimates about a population from the investigation of a sample.
These inaccuracies are called ‘errors’ in statistics. Error in
the science of statistics does not mean mistake. It means the
difference between the observed value and the true value. If
there are 50 students in a room and we count them as 52, it
will be a mistake; but if on some basis we estimate them to be
52, the difference will be called an error in statistics.

Sources of Errors. Statistical errors, or differences between
true and observed values, may arise due to the following
reasons:

1. Errors of origin: Errors may arise due to bias in
the information collected, i.e. a defect in the definition of the
subject matter of enquiry or of the units of collection. For
example, if in an enquiry about the age of a group of persons the
information is collected by age at next birthday, there will be an
error. This is due to a defect in the definition of the unit of
collection. Similarly errors may also arise from a defective
definition of the subject matter of enquiry. For example, if the
purpose is to enquire into the family living conditions of working
class people, but due to some defect in the definition of working
class people some middle class families are also included, some
errors will arise.

2. Errors of inadequacy: Errors may also arise due to the
smallness of the sample selected. If the sample contains items
which are not representative of the universe, errors will
result.

3. Errors of manipulation: Errors may also arise due to
mistakes in calculation, counting, measurement, description,
classification, approximation or clerical lapses. Such errors are
caused unconsciously.

Measurement of Errors. The measurement of errors may be
either (a) Absolute or (b) Relative.

The difference between the true value and the estimated
value of an item is called absolute error. To put it in the
form of a formula

    AE = A - E

    where AE = Absolute Error
          A  = Actual value
          E  = Estimated value

If the number of students in a college is estimated at 2200
while actually it is 2300, then the absolute error will be

    AE = A - E = 2300 - 2200 = 100

Absolute error may be either positive or negative. If the
estimated value is more than the actual value it will be negative
and if the estimated value is less than the actual value it will be
positive. Symbolically

    A > E = Positive absolute error
    A < E = Negative absolute error.

If in the above example the estimated number of students is
2400, then the absolute error will be

    AE = A - E = 2300 - 2400 = -100
Relative error is the ratio of the absolute error to the
estimated value. Or

    RE = (A - E) / E

    In the first case   RE = (2300 - 2200) / 2200 =  100 / 2200 =  .045
    In the second case  RE = (2300 - 2400) / 2400 = -100 / 2400 = -.042

Relative error may also be expressed as percentage error.
Thus

    RE × 100 = Percentage Error.
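The three measures can be computed directly from the definitions. The sketch below follows the text in dividing by the estimated value; the function names are chosen here for illustration.

```python
def absolute_error(actual, estimated):
    """AE = A - E (positive when the estimate falls short of the actual)."""
    return actual - estimated

def relative_error(actual, estimated):
    """RE = (A - E) / E, the absolute error as a ratio of the estimate."""
    return (actual - estimated) / estimated

def percentage_error(actual, estimated):
    """Relative error expressed per hundred."""
    return relative_error(actual, estimated) * 100

# The college example: actual strength 2300, estimated at 2200 and at 2400.
for estimate in (2200, 2400):
    print(
        f"estimate {estimate}: "
        f"AE = {absolute_error(2300, estimate):+d}, "
        f"RE = {relative_error(2300, estimate):+.3f}, "
        f"PE = {percentage_error(2300, estimate):+.1f}%"
    )
```

Note that the two estimates are equally far from the actual figure in absolute terms, yet the relative errors differ slightly because each is divided by its own estimated value.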

In statistics relative error is more important than absolute
error. This can be illustrated by an example. Suppose the monthly
incomes of two persons A and B are 50 and 500 respectively
and they are estimated at 45 and 450. The absolute error
will be

    For A                        For B
    50 - 45 = 5                  500 - 450 = 50

But the relative error will be

    (50 - 45) / 45 = .111        (500 - 450) / 450 = .111
    or 11.1%                     or 11.1%

Thus the two estimates are equally accurate in relative terms,
although the absolute error for B is ten times that for A.


Biased and Unbiased Errors. Biased errors are those
which are due to the bias or prejudice of the enumerator or to a
defect in the measuring instrument. An error which goes on
accumulating may be called a biased error. It is also called a
cumulative error. If an investigator always over-estimates
or always under-estimates, it shows a bias in his mind. Such
errors act in one direction only, therefore their effect is
cumulative.

Unbiased errors are those which arise by chance in the
ordinary course of an enquiry. Unbiased errors cancel each other
and hence they are also called ‘compensating errors’. Such errors
act in both directions, thus their effect is compensating. They
tend to offset each other and leave little effect on the general
result.

Biased errors may also be due to bias in approximations.
Such errors may also arise due to the prejudice of the reporter.
For example, if a number of women are asked to state their
age, they tend to state an age lower than their actual age;
similarly an administrative officer tends to over-estimate the
staff requirement of his department. Thus with an increase in the
number of observations the biased error increases, but the larger
the number of observations, the smaller is the unbiased error
in relation to the total.
That is why it is said that, “of the biased errors we must
have none and of the unbiased errors, the more the merrier.”


Calculation of unbiased error

                         Unbiased           Biased I            Biased II
    Actual figures    (to the nearest    (digits lower than   (raised to the next
                       000), in 000        000 dropped),       higher 000),
                                             in 000              in 000
        17,118              17                  17                  18
           618               1                   0                   1
         1,258               1                   1                   2
         8,362               8                   8                   9
        15,448              15                  15                  16
         7,645               8                   7                   8
        11,759              12                  11                  12
        10,509              11                  10                  11
         2,480               2                   2                   3
    Total 75,182            75                  71                  80

    Absolute Error (Unbiased)             75182 - 75000 =   182
    Absolute Error (Biased) in I Case     75182 - 71000 =  4182
                            in II Case    75182 - 80000 = -4818

    Unbiased error = Average absolute error of items × √N

In unbiased rounding a remainder of more than 500 has been
taken as equal to one thousand and a remainder of less than 500
has been ignored. Thus the minimum and maximum limits of error
for an item are 0 and 499. Their average is equal to 249.5 or 250.

    Unbiased error will be 250 × √9 = ± 750

The actual error (182) is less than this.

Calculation of biased error.

    Biased error = Average absolute error of items × N

In the above case the limits of error are 0 and 1000, and the
average thereof is 500, so that the biased error is
500 × 9 = ± 4500.

The biased errors of 4182 and -4818 are very near to this
figure.
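The two estimating formulas differ only in whether the average error per item is multiplied by √N or by N. A minimal sketch follows; the function names are illustrative, and the average errors of 250 and 500 per item are the ones used in the text for rounding to thousands.

```python
import math

def estimated_unbiased_error(average_item_error, n_items):
    """Compensating errors grow only with the square root of N."""
    return average_item_error * math.sqrt(n_items)

def estimated_biased_error(average_item_error, n_items):
    """One-sided errors accumulate in proportion to N."""
    return average_item_error * n_items

# Nine items rounded to thousands, as in the table.
print(estimated_unbiased_error(250, 9))  # limit of about 750 either way
print(estimated_biased_error(500, 9))    # limit of about 4500 on one side

# How the two estimates diverge as the number of items grows.
for n in (9, 100, 10_000):
    unbiased = estimated_unbiased_error(250, n)
    biased = estimated_biased_error(500, n)
    print(f"N = {n:>6}: unbiased ± {unbiased:>9,.0f}   biased ± {biased:>11,.0f}")
```

Because the biased estimate grows with N and the unbiased one only with √N, the gap between them widens rapidly as observations are added.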


Possible error and Probable error. Possible error is given
by the limits within which the actual error must lie. In the
above example, while rounding, a remainder of less than 500 is
ignored and a remainder of 500 or more is taken as equal to 1000.
The possible error in this case is ± 500. If the rounded figure is
18000 then the actual figure will lie between 17500 and 18499.


Probable error and standard error are used in the process
of sampling. They provide limits within which a sample result
might have deviated from the true population value due to random
sampling.


Theoretical Questions 


1. In what way does a ‘Statistical Error’ differ from a
‘mistake’? What classes of errors are there and how may they
be measured? (B. Com. Alld.)

2. Discuss the standard of accuracy required in statistical
calculation. To what extent should approximation be used?
(M.A. Agra)

3. ‘In any sample survey there are many sources of errors.
A perfect survey is a myth’. Discuss this statement.
(M.A. Agra)

4. Of the ‘Biased Errors’ the statistician should have none,
but of the ‘Unbiased ones the more the merrier’ notwithstanding
that they are also errors. Elucidate. (B. Com. Alld.)

5. Mention the advantages of approximation in Statistics.
What degree of accuracy is generally required in each statistical
investigation? (M. Com. Raj.)

6. (a) Discuss the main sources of error in Statistics and
their effects.
(b) State the various methods of approximation and their
utility in Statistics. (B. Com. Agra)

7. Distinguish between
(a) Absolute and Relative errors
and (b) Biased and Unbiased errors.
Discuss the effects of these errors and explain the steps that
are taken to meet the effects. (B. Com. Agra)


CHAPTER 5


CLASSIFICATION AND
TABULATION


"In no investigation of any size is the volume of collected
data or material so small that it may be rapidly or easily
assimilated by a perusal of the completed forms. The statistician's
first task is to reduce and simplify the details into such a form that
the salient features may be brought out, while still facilitating
the interpretation of the assembled data. This procedure is
known as classifying and tabulating the data.”

A. R. Ilersic

The collected data require the exercise of considerable 
vigilance on the part of the investigator. The collected data 
are too voluminous for location of facts. These are to undergo 
further processing. Data as such are not directly fit for 
analysis or interpretation. The unwieldy, unorganised and
shapeless mass of collected data is not capable of being rapidly
or easily assimilated or interpreted. Data are to be provided
with a form or structure. Data have to be boiled down to 
make them fit for digestion. For interpretation and analysis, 
it is essential that the data are in a condensed form. Hence 
data should be treated statistically. 


Classification. The first step towards further processing
of statistical data after collection is classification. For instance,
when the census is taken a mass of matter in the form of
schedules will be collected giving information about age, sex,
civil condition, education, employment etc. But it will need
proper classification before this mass of statistical data can be
made useful. Groups will have to be made by taking into
account common characteristics or attributes of the various
units under consideration such as age, sex etc. Classification
is thus the process of arranging the available matter in groups
or classes according to resemblance and likeness. Data are
classified according to some common features or object in view.
According to Prof. Connor, “classification is the process of
arranging things (either actually or notionally) in the groups
or classes according to their resemblances and affinities.” In
the process of classification items having common characteristics
have to be brought together. Likes go with likes, and likes are
separated from the unlikes. The data may be classified
on the following bases:


(a) Qualitative, or classification by attributes, when the 
classification is made according to quality or 
attribute. Data may be classified according to some 
attribute as, blind and not blind, literate or not 
literate. These are qualities which cannot be 
directly measured. 

(b) Quantitative or Numerical, where classification is
made according to quantity. If data of sales are
classified according to the quantity or value of the
commodity sold, it will be a case of quantitative
classification.

Objects of classification. The chief objects of classification
are:

1. To eliminate unnecessary details.

2. To bring out clearly points of similarity and
dissimilarity.

3. To enable one to form a mental picture of objects of
perception and conception.

4. To enable one to make comparisons, draw inferences
and locate facts.

5. To make summarisation of data possible and easy by
presenting the data in a suitable form.

Though many advantages are derived from classification,
some amount of detail is lost in the process of summarisation.
The greater the extent of summarisation, the greater is the
loss of detail. The statistician will have to weigh
the advantages he would derive from summarisation against the
loss of detail he will have to suffer, and decide on the extent of
summarisation accordingly.

Characteristics of classification. When we make a
classification, we break up the subject matter into a number of
classes. It is important that the classification should possess the
following characteristics:

1. Exhaustive: The classification system must be
exhaustive. There must be no item which cannot find a class. There
must be a place for each item of data in one of the classes.
If the classification is made exhaustive there will be no room for
ambiguity. For example, a classification of persons by conjugal
condition having two classes, ‘married’ and ‘single’, is not
exhaustive. Doubt will arise in several cases where
persons are widowed or divorced. Therefore, in order to remove
doubt, the classification should be made exhaustive by making
classes like ‘never married’, ‘widowed’, ‘divorced’, ‘separated’,
‘married’ and ‘not married but living like married’.


2. Mutually Exclusive: The classes must not overlap.
That is, each item of data must find its place in one class and
one class only. There must be no item which can find its way
into more than one class.


3. Stability: Classification must proceed at every stage
in accordance with one principle, and that principle should be
maintained throughout. If a classification is not stable and is
changed for every enquiry, then the data will not be fit for
comparison.


4. Flexibility: A good classification should be flexible
and should have the capacity of adjustment to new situations
and circumstances.
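The first two characteristics can be checked mechanically for any proposed scheme of classes. The sketch below is a hypothetical helper, not a standard routine: each class is described by a rule (a function returning true or false), and the helper lists the items that find no class (the scheme is not exhaustive) and the items that find more than one (the classes are not mutually exclusive).

```python
def check_classification(items, classes):
    """Return (unplaced, overlapping) items for a dict of class rules."""
    unplaced, overlapping = [], []
    for item in items:
        matches = [name for name, belongs in classes.items() if belongs(item)]
        if not matches:
            unplaced.append(item)       # no class: not exhaustive
        elif len(matches) > 1:
            overlapping.append(item)    # several classes: not exclusive
    return unplaced, overlapping

# Conjugal condition with only two classes is not exhaustive.
persons = ["married", "single", "widowed", "divorced"]
two_classes = {
    "married": lambda status: status == "married",
    "single": lambda status: status == "single",
}
print(check_classification(persons, two_classes))

# Class limits that share the value 10 are not mutually exclusive.
shared_limit = {
    "0 to 10": lambda x: 0 <= x <= 10,
    "10 to 20": lambda x: 10 <= x <= 20,
}
print(check_classification([5, 10, 15], shared_limit))
```

A scheme passes both tests only when both returned lists are empty.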


Classification of data by attributes. When data are
classified according to attributes, the presence or absence of
an attribute is the basis of classification. If we consider one
attribute, two classes are formed, one possessing that attribute
and the other not possessing it. For example, if
data are classified according to the attribute ‘literacy’, there
will be two classes, ‘literate’ and ‘illiterate’. Similarly
classification can be made into males and females, married and
not married, employed and not employed, and so on. Such a
classification, where two divisions are made on the basis of one
attribute, is known as Simple or Two-fold or Dichotomous
Classification.


If these two classes are further sub-divided on the basis
of other attributes, it is called Manifold Classification. For
example, a population may first be divided into two classes on
the basis of sex. These two classes may then be
divided on the basis of marital condition, so that
each is split into two sub-classes. Again
each sub-class may be further subdivided according to the
attribute literacy, as illustrated in the following chart.

                 Population (Manifold Classification)
                                  |
              +-------------------+-------------------+
             Male                                   Female
              |                                       |
        +-----+-----+                           +-----+-----+
     Married     Unmarried                   Married     Unmarried
        |            |                          |            |
    Literate     Literate                   Literate     Literate
    Illiterate   Illiterate                 Illiterate   Illiterate

Classification by class-intervals or groups. Classification
by groups or class intervals is possible only in those cases
where direct quantitative measurement of the data is possible.
Data pertaining to height, weight, income, exports, production
etc. come under this category. A quantity which can
assume a range of numerical values is known as a variable and
each value within the range is called a variate. Suppose we
have to study the expenses of a thousand students of a certain
college. The data will be the amount of expense incurred by every
student to the nearest rupee. We will have 1000 figures ranging
from, say, Rs. 50 to Rs. 200. These figures will have to be grouped
in such a way that the mass of data becomes concise and easy
to understand. In this case we can group the students according
to their expenses within a certain range, say 25. The classes
will then be:


    Expenses in rupees      No. of students

          50– 75                  200
          75–100                  150
         100–125                  300
         125–150                  150
         150–175                  150
         175–200                   50

                         Total  1,000

    Lowest value: 50        Largest value: 200


Magnitude of class intervals: In the above example 50,
75, 100, 125, 150 and 175 are the lower limits of the respective
classes and 75, 100, 125, 150, 175 and 200 are the upper limits of
the respective classes. The difference between the upper and lower
limits of a class is the magnitude of the class. In the above
example the magnitude of the classes is 75 - 50 = 25. What
should be the magnitude of the class? It will depend upon the
number of observations, their range and the number of classes to be
formed. For example, if the number of observations is 1000 and it is
desired to arrange them in 6 groups, then the magnitude of the
classes will be

    Magnitude = (Largest value - Smallest value) / No. of classes to be formed

In the above case (200 - 50) / 6 = 25, and classes can be formed by
adding this magnitude to the lowest value successively: 50–75,
75–100, 100–125, 125–150, 150–175 and 175–200.

Frequency :—Number of observations falling within a 
particular class interval is called frequency of that class. In 
the above example frequency of class 50—75 is 200. 
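The calculation of class magnitude and the formation of classes can be sketched as follows. The function names are illustrative; the frequencies are those of the students' expenses example, summing to the stated total of 1,000.

```python
def class_magnitude(largest, smallest, n_classes):
    """Magnitude = (largest value - smallest value) / number of classes."""
    return (largest - smallest) / n_classes

def class_intervals(smallest, magnitude, n_classes):
    """Form classes by adding the magnitude to the lowest value repeatedly."""
    return [
        (smallest + i * magnitude, smallest + (i + 1) * magnitude)
        for i in range(n_classes)
    ]

magnitude = class_magnitude(200, 50, 6)
intervals = class_intervals(50, magnitude, 6)

# Frequency of each class in the students' expenses example.
frequency = dict(zip(intervals, [200, 150, 300, 150, 150, 50]))

print("magnitude of each class:", magnitude)
for (lower, upper), count in frequency.items():
    print(f"{lower:>5.0f} to {upper:<5.0f} {count:>5}")
print("total", sum(frequency.values()))
```

When the range does not divide evenly by the number of classes, the magnitude is usually taken as a convenient round figure rather than the exact quotient.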

Limits of Class Intervals :—There are two ways of forming 
class-intervals—(1) Exclusive Method and (2) Inclusive Method. 
Under the exclusive method the upper limit of one class interval 
is the lower limit of the next class e.g. 0—10, 10—20, 20—30, 
30—40 and so on. In this case all items which measure less 
than 10 will be included in the class 0—10. Items measuring 
exactly 10 will be included in the next class of 10—20. 

There is another method of framing the class intervals.
In this method, known as the Inclusive method, ambiguity about
items identical with a limit of the class interval is sought to be
removed. Under this method the above class intervals will
become 0–9, 10–19, 20–29, 30–39. To remove the difficulty
of an item which is not a whole number and falls between the
upper limit of a class and the lower limit of the next class, the
above classes may also be expressed under the inclusive method
as 0–9.5, 10–19.5, 20–29.5, 30–39.5. This method is not
much in use for want of continuity. These methods may be
illustrated as follows:


             A                        B                        C
     (Exclusive Method)       (Inclusive Method)       (Inclusive Method)
    Class                    Class                    Class
    Intervals  Frequency     Intervals  Frequency     Intervals  Frequency

      0–10         8           0–9          8           0–9.5        8
     10–20        20          10–19        20          10–19.5      20
     20–30        36          20–29        36          20–29.5      36
     30–40        50          30–39        50          30–39.5      50
     40–50        30          40–49        30          40–49.5      30
     50–60        16          50–59        16          50–59.5      16
                 160                      160                      160
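The difference between the two methods shows up when an item has to be assigned to its class. The sketch below is illustrative; the function names are assumptions, and the standard library `bisect_right` is used only as a convenient way of finding the class whose lower limit does not exceed the item.

```python
from bisect import bisect_right

def exclusive_class(value, limits):
    """Exclusive method: an item equal to a limit goes to the next class."""
    i = bisect_right(limits, value) - 1
    if i < 0 or i >= len(limits) - 1:
        return None  # outside the range covered by the classes
    return (limits[i], limits[i + 1])

def inclusive_class(value, classes):
    """Inclusive method: both limits of a class belong to that class."""
    for lower, upper in classes:
        if lower <= value <= upper:
            return (lower, upper)
    return None  # the item falls in a gap between two classes

limits = [0, 10, 20, 30, 40, 50, 60]
inclusive = [(0, 9), (10, 19), (20, 29), (30, 39), (40, 49), (50, 59)]

print(exclusive_class(10, limits))      # goes to 10-20, not 0-10
print(inclusive_class(10, inclusive))   # goes to 10-19
print(exclusive_class(9.7, limits))     # 0-10
print(inclusive_class(9.7, inclusive))  # no class: the series is not continuous
```

The last line shows the want of continuity mentioned in the text: a fractional item between 9 and 10 finds no place under the inclusive classes.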


Class intervals may also be written as:

    Exceeding    Not exceeding    Frequency

        0             10              8
       10             20             20
       20             30             36
       30             40             50
       40             50             30
       50             60             16
                                    160


Sometimes in place of the words ‘Exceeding’ and ‘Not exceeding’
the words ‘More than’ and ‘Less than’ are also used.


There is another way of giving a series. Sometimes full
class intervals are not given and only mid values are given. For
example, a series may be written as follows:


    Mid value    Frequency

        5             8
       15            20
       25            36
       35            50
       45            30
       55            16
                    160


To convert such a series into one of class intervals, the following
steps should be taken.


(1) Find out the difference between two successive mid values.


(2) The difference should be halved. By deducting half the
difference from the mid value, the lower limit of the class will be
found, and by adding half the difference to the mid value
the upper limit of the class will be found.

In the above example the difference between two mid values is
10. Half of this is 10 / 2 = 5. When this is deducted from the mid
value, 5 - 5 = 0 is the lower limit, and when it is added to the mid
value, 5 + 5 = 10, we get the upper limit of the class.
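The two steps can be expressed as a short function. This is a sketch which assumes the mid values are equally spaced, as they are in the example; the function name is illustrative.

```python
def limits_from_mid_values(mid_values):
    """Recover (lower, upper) class limits from equally spaced mid values."""
    # Step 1: difference between two successive mid values.
    difference = mid_values[1] - mid_values[0]
    # Step 2: halve it, then deduct from and add to each mid value.
    half = difference / 2
    return [(mid - half, mid + half) for mid in mid_values]

mid_values = [5, 15, 25, 35, 45, 55]
frequencies = [8, 20, 36, 50, 30, 16]

for (lower, upper), f in zip(limits_from_mid_values(mid_values), frequencies):
    print(f"{lower:>4.0f} to {upper:<4.0f} {f:>4}")
```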
In certain cases ‘open end’ class intervals are given, as
in the example given below:


    Class Intervals    Frequency

    Below 10                8
    10–20                  20
    20–30                  36
    30–40                  50
    40–50                  30
    Above 50               16
                          160


In such cases we supply the missing limits on the basis of the
construction of the series. Here ‘0’ may be put in place of
‘below’ and 60 in place of ‘above’.

Data are sometimes given in unequal class intervals. The
series may also take the following shapes—


        A                     B                     C
Class     Frequency   Class     Frequency   Class     Frequency
 0— 5        X          2          X          2          X
 5—10        Y          5          Y          2— 6      X+Y
10—20        Z          7          Z          2— 8      X+Y+Z
20—30        A          7—20       A          8—10       A
30—50        B         20—40       B         10—12       B
50—75        C         40—60       C         12—14       C


Such series are used when there is great fluctuation in the data.


Cumulative Series. A series may also be given in the following
forms—


              A                               B

Below or Less than    Frequency    Above or More than    Frequency
or Not exceeding                   or Exceeding

       10                  8                0               160
       20                 28               10               152
       30                 64               20               132
       40                114               30                96
       50                144               40                46
       60                160               50                16

The above series are cumulative frequency series.
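
The link between the two cumulative series and the ordinary class frequencies can be sketched in Python. This is an illustrative sketch only; the function names are hypothetical, and the frequencies are those of the series used throughout this section.

```python
from itertools import accumulate

frequencies = [8, 20, 36, 50, 30, 16]   # classes 0-10, 10-20, ..., 50-60

# 'Less than' series: running total from the first class onwards.
less_than = list(accumulate(frequencies))
# 'More than' series: running total from the last class backwards.
more_than = list(accumulate(reversed(frequencies)))[::-1]


def from_less_than(cumulative):
    # Class frequency = cumulative frequency of the class
    # minus that of the preceding class.
    return [c - p for p, c in zip([0] + cumulative[:-1], cumulative)]


def from_more_than(cumulative):
    # Class frequency = cumulative frequency of the class
    # minus that of the next class.
    return [c - n for c, n in zip(cumulative, cumulative[1:] + [0])]


print(less_than)
print(more_than)
print(from_less_than(less_than), from_more_than(more_than))
```

Both conversions return the original class frequencies, which is a convenient check on a cumulative table.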
In table ‘A’, the class frequency is found by deducting the
cumulative frequency of the preceding class from the cumulative
frequency of the class. For example—


Class interval                        Frequency
 0—10                 8                    8
10—20            ( 28 —   8)              20
20—30            ( 64 —  28)              36
30—40            (114 —  64)              50
40—50            (144 — 114)              30
50—60            (160 — 144)              16
                           Total         160


In table ‘B’, the class frequency is found by deducting the
cumulative frequency of the next class from the cumulative
frequency of the class. For example—


Class interval                        Frequency
 0—10             160 — 152                8
10—20             152 — 132               20
20—30             132 —  96               36
30—40              96 —  46               50
40—50              46 —  16               30
50—60              16 —   0               16
                           Total         160


Arrangement of frequencies in class intervals. Now the
question arises, how to find out the number of frequencies of
different groups from a large number of individual observations.
This is done with the help of tally sheets as illustrated below :—


Below are given the monthly rents of 300 houses. Put the
data in a continuous series of interval of 10, like 20—30, 30—40
and so on.

80, 65, 23, 46, 41, 27, 58, 44, 57, 34, 60, 28, 56, 63, 51, 53, 61, 50, 65, 54 
70, 50, 74, 55, 67, 51, 55, 71, 68, 51, 39, 71, 41, 72, 45, 24, 58, 48, 79, 42 
90, 71, 46, 81, 28, 60, 53, 75, 40, 66, 80, 33, 73, 86, 56, 47, 97, 76, 51, 81 
49, 78, 47, 71, 49, 73,04, 44, 95, 27, 45, 87, 31, 89, 39, 59, 28, 83, 44, 70 
80, 46, 85, 54, 28, 85, 50, 63, 52, 73, 64, 62, 52, 61, 50, 60, 55, 53, 56, 67 
51, 71, 98, 29, 56, 47, 69, 77, 49, 69, 45, 79, 37, 40, 91, 42, 59, 59, 94, 55 
62, 36, 53, 41, 58, 53, 66, 55, 57, 34, 02, 25, 76, 72, 44, 54, 83, 72, 54, 98 
89, 80, 65, 53, 58, 50, 68, 88, 90, 43, 57, 55, 47, 58, 58, 64, 80, 48, 61, 84 
63, 52, 54, 57, 78, 42, 58, 46, 67, 34, 54, 48, 58, 52, 36, 55, 82, 66, 46, 52 
45, 52, 75, 54, 70, 44, 65, 49, 56, 76, 61, 49, 45, 86, 49, 62, 55, 48, 60, 46 
51, 52, 97, 69, 24, 25, 55, 88, 29, 52, 36, 62, 48, 54, 56, 65, 40, 79, 63, 61 
55, 54, 33, 52, 68, 54, 50, 60, 26, 57, 51, 41, 52, 70, 45, 67, 49, 89, 50, 54 
44, 60, 56, 56, 55, 39, 49, 62, 46, 66, 51, 40, 56, 72, 64, 58, 52, 49, 45, 59 
57, 55, 31, 55, 64, 75, 28, 57, 81, 48, 29, 57, 75, 57, 59, 48, 55, 64, 50, 48 
48, 85, 56, 76, 33, 57, 47, 49, 84, 38, 52, 34, 55, 54, 41, 44, 52, 47, 85, 58 


TALLY SHEET
[Tally Marks table not reproduced]
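
The tallying procedure can be sketched in Python. This is a minimal sketch under stated assumptions: the function name `frequency_table` is hypothetical, only the first row of the rent data is used as a sample, every value is assumed to be 20 or more, and classes follow the exclusive method (a value equal to an upper limit is counted in the next class).

```python
from collections import Counter


def frequency_table(values, start=20, width=10):
    """Tally values into classes start to start+width, and so on.

    Assumes every value is at least `start`.
    """
    counts = Counter((v - start) // width for v in values)
    table = []
    for k in range(max(counts) + 1):
        lower = start + k * width
        table.append((lower, lower + width, counts.get(k, 0)))
    return table


# First row of the rent data, used here as a small sample.
rents = [80, 65, 23, 46, 41, 27, 58, 44, 57, 34,
         60, 28, 56, 63, 51, 53, 61, 50, 65, 54]

table = frequency_table(rents)
for lower, upper, freq in table:
    # Tally marks are bundled in fives, as on a hand-drawn tally sheet.
    bundles = ["||||/"] * (freq // 5)
    rest = ["|" * (freq % 5)] if freq % 5 else []
    print(f"{lower}-{upper}  {' '.join(bundles + rest):<12} {freq}")
```

Running the same function on all 300 rents would give the full frequency distribution; the total of the frequency column should always equal the number of observations.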


Statistical Series. According to Prof. Connor, “If two
variable quantities can be arranged side by side so that
measurable differences in the one correspond with measurable
differences in the other, the result is said to form a Statistical
series.” There are three kinds of statistical series :—


1—Time Series :—In a time series data are presented with
regard to time. For example, data regarding some particular
phenomenon are given with reference to some time unit like year,
month, week or day. The following is an example of a time
series :—

Year        Population of the country
            in millions
1901            235.5
1911            249.0
1921            248.1
1931            275.5
1941            312.8
1951            356.9
1961            438.0


2—Spatial Series :—When data are presented with
reference to space, the series is called a spatial series. The
following is an example of a spatial series.


State           Population
                per sq. mile
U.P.               560
M.P.               170
Bihar              571
Rajasthan          121
Andhra             317
Bombay             327
Kerala           1,013


3—Condition Series :—If data are presented with
reference to some condition, the series is called a condition
series. Below is given an example of this.


Industry                Amount in crores of rupees
                        (Block capital)
Cement                      93.0
Sugar                       51.0
Paper                       54.0
Cotton, Jute, etc.          36.0

There are three types of statistical series viz—


(1) Continuous series
(2) Discrete series
(3) Series of individual observations


1—Continuous series is one where measurements are only 
approximations and are expressed in class intervals i.e. within 
certain limits. It is like— 


Marks obtained No. of students 
0 — 20 10 
20— 40 15 
40— 60 30 
60— 80 15 
80—100 10 
80 


2—Discrete series is one where the items are of exact 
magnitude. No approximation is needed, class limits are not 
used. Definite breaks are visible between the different items. 
It is like— 


Wages per day     No. of workers
(Rs.)
    1                  50
    2                 150
    3                  50


3—Series of individual observations is a series where items
are listed singly after observation, as distinguished from listing
them in groups. If marks of 80 students in a particular subject
are given individually, it will form a series of individual
observations. Following is an example of such a series.


Marks obtained by 50 students of a University in Statistics

No.  Marks | No.  Marks | No.  Marks | No.  Marks | No.  Marks
 1    90   | 11    50   | 21    50   | 31    79   | 41    78
 2    80   | 12    55   | 22    60   | 32    53   | 42    76
 3    20   | 13    90   | 23    70   | 33    33   | 43    50
 4    10   | 14    37   | 24    70   | 34    33   | 44    52
 5    25   | 15    35   | 25    25   | 35    80   | 45    53
 6    55   | 16    55   | 26    90   | 36    50   | 46    60
 7    60   | 17    60   | 27    33   | 37    45   | 47    60
 8    80   | 18    39   | 28    42   | 38    52   | 48    10
 9    30   | 19    87   | 29    45   | 39    60   | 49    65
10    88   | 20    98   | 30    60   | 40    70   | 50    69


These data may also be arranged in an ascending
or descending order


(Marks arranged in Ascending Order)


10 10 20 25 25 30 33 33 33 35
37 39 42 45 45 50 50 50 50 52
52 53 53 55 55 55 60 60 60 60
60 60 60 65 69 70 70 70 76 78
79 80 80 80 87 88 90 90 90 98


(Marks arranged in Descending Order)


98 90 90 90 88 87 80 80 80 79
78 76 70 70 70 69 65 60 60 60
60 60 60 60 55 55 55 53 53 52
52 50 50 50 50 45 45 42 39 37
35 33 33 33 30 25 25 20 10 10
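
Arranging individual observations in order is a one-line operation in most languages. The short Python sketch below uses a handful of marks of the kind listed above as a hypothetical sample; the variable names are illustrative.

```python
marks = [90, 20, 10, 55, 50, 60]   # a few individual observations

ascending = sorted(marks)                  # smallest first
descending = sorted(marks, reverse=True)   # largest first

print(ascending)
print(descending)
```

The descending array is simply the ascending array read backwards, which gives a quick check when the arrangement is done by hand.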


TABULATION 


After classification, data are arranged in an orderly manner
which will help reveal and properly emphasise their
characteristics. A statistical table is a systematic organisation of data in
columns and rows. A table summarizes the data by using
columns and rows and entering figures in the body of the table.
According to H. Secrist, “Tables are a means of recording in
permanent form the analysis that is made through classification
and of placing in juxtaposition things that are similar and should
be compared.” Prof. Connor states that, “Tabulation involves the
orderly and systematic presentation of numerical data in a
form designed to elucidate the problem under consideration.”
Hence tabulation is a scientific process involving the
presentation of classified data in an orderly manner so as to bring out
their essential features and chief characteristics. A statistical
table may be either primary or derivative. A primary table is
one which is prepared on the basis of data actually collected.
A derivative table, on the other hand, is one which is prepared
on the basis of statistical derivatives like rates, ratios, percentages,
coefficients etc.


Forms of Tables. There are four methods of tabulation :—
(1) Single Tabulation
(2) Double Tabulation
(3) Treble Tabulation
(4) Manifold Tabulation

1—A simple table is one in which data are presented
according to one characteristic only. It is also called a Table
of the First order. The following is an example of a simple
table :—


TABLE
No. of students in different colleges in Vikram University


Class          No. of students
M.A.                2,000
M.Sc.                 500
M.Com.                500
B.A.                4,000
B.Sc.               2,000
B.Com.              1,500
Total              10,500

Double, treble and manifold classification is based on
sub-division of the data into different columns according to other
characteristics.


The following is an example of double tabulation.


No. of Students
[Blank table; only the rows B.Sc., B.Com. and Total survive]


If one more characteristic is added, it will become a treble
tabulation.

In a manifold tabulation many characteristics of the data
are shown separately.


Illustration. 


Prepare a blank table to show the distribution of population
according to sex and four religions in five age groups in five
important cities of U.P. (B. Com. B.H.U.)


The table may be prepared in the way given in the front page.


Tables may also be classified according to the purpose of 
their construction. On this basis a table may be either a general 
purpose table or a special purpose table. A table which merely 
records the facts without any special analytical purpose may be 
called a ‘general purpose’, ‘repository’ or ‘reference’ table.
The Government, the Reserve Bank etc. collect a wide variety of
statistical information and publish their findings in reference
tables which are available to others who wish to use them in the
analysis of special problems. When a statistician uses the
figures found in reference tables, he will select the data which
are pertinent to his problem and establish new classifications or
groupings, if necessary, to throw light on the particular problem
at issue. These selected or regrouped data are then put in a
table form and referred to in the analysis. Such tables are
called ‘special purpose’, ‘summary’ or ‘text’ tables.

Rules for Tabulation. Drawing up a good table is an art 
which requires practical experience. Mr. Harry Jerome has 
stressed the role of ingenuity and the importance of mastery 
of technique in the construction of tables in the following 
words, “A good statistical table is not a mere careless grouping 
of columns and rows of figures ; it is a triumph of ingenuity 
and technique, a masterpiece of economy of space combined with 
a maximum of clearly presented information. To prepare a 
first class table one must have a clear idea of the facts to be 
presented, the contrasts to be stressed, the points upon which 
emphasis is to be placed and lastly a familiarity with the 
technique of preparation.” 

There are no hard and fast rules for tabulation. In the 
words of Prof. A. L. Bowley, “In collection and tabulation 
commonsense is the chief requisite and experience the chief 
teacher.” However, certain rules of procedure may be laid
down for the guidance of tabulators. These rules may be
divided into two groups.

(a) Rules relating to Table Structure
(b) General rules.

(a) Rules relating to Table Structure


1—Number :—A statistical table must have a number, so 
that it may be easily identified. 


2—Title :—There must be a title of a table placed above 
it. Title should be brief, clear, concise and self-explanatory. 
A title explains in brief and concise language (i) what the 
data are (ii) where the data are (iii) the classification principle 
and (iv) time period of data. A satisfactory title answers the 
four questions : What ? Where ? How classified ? and when ? 


3—Captions and Stubs :—The left hand column and its 
heading is called ‘Stub’. Stub items should as far as possible, 
not take more than a line. They should be explanatory. 
Captions are the headings of other columns. They should be 
clearly defined and put in the middle of the column. The 
wording of the heading over each column and at each stub should 
be as brief as possible. The box over the stub on the left of the 
table should contain words descriptive of the stub contents. 
Columns should be numbered. 

4—Ruling and Spacing :—In order to give neat and tidy 
appearance to the table, there should be proper ruling and 
spacing. The size of the table should be adjusted according to 
the space available. Items should not be jumbled together. 
Major and minor items should be given space according to their 
relative importance. If the table is complex and cannot be 
broken down suitably, thick and thin rulings, heavy printed 
sub-titles and coloured inks will all help in clarifying the overall 
picture. 

5—Averages and totals :—Averages are usually placed at
the bottom of the numbers averaged. To render it more useful,
the table should contain sub-totals for each separate classification
of data and a general total for all combined classes. When
both totals and averages appear in one table, averages usually
follow the total from which they are computed.

6—Body of the table :—The tabulator should make the table
as comprehensive as possible consistent with the purpose in
mind. Irrelevant matter must be avoided at all costs. Items
in the miscellaneous or unclassified columns should be reduced
to a minimum.

There should be proper arrangement of items in the table.
There should be a systematic order in the items. Items may
be arranged in the following ways :


(i) Alphabetically ; (ii) Geographically ; (iii) Chronologically ;
(iv) Conventionally ; (v) Progressively ; or (vi) in
ascending or descending order. Items of special significance
which are to be emphasized must be underlined or written in
bold letters.


7—Footnotes :—Footnotes are necessary when there is 
some important limitation on the scope of the data or it is 
desired to specify some characteristic of an item. 


8—Source :—Another formal requirement is the mention of
the source of data. The source should be so definitely described
that any one using the data can trace them to the source without
difficulty.


(b) General Rules :—


1— The table should not be overloaded with details. The 
aim of simplicity should be constantly kept in mind so that 
the story, the data have to tell, may be most clearly conveyed 
to the reader. 


2—Figures may be rounded to avoid unnecessary details 
in the table, but a footnote to this effect must be added. 


3— Table should be adjusted to the space available but 
should not be too narrow or too wide. Columns should be 
carefully planned and unnecessary variations in width avoided. 


4— Columns of figures which are directly comparable 
should be kept together. 


5—Units of measurement should be clearly shown and, if 
necessary be defined. 


6—If certain data are not available or have been estimated,
this fact must be given as a footnote.


7—Size of the table should be such that all its contents
are visible at a glance.

The structure of a table is illustrated below

Advantages of Tabulation. The process of tabulation has
the following advantages :—


1—The data are presented in a manner that those for
whom they are intended are able to make the best use of them.

2—The data are presented in a brief and compact manner.
The required information can be gathered from the tabulated
data at a moment’s notice.

3—Comparison is made easy.

4—The data are arranged in a logical manner.

5—Items are easy to remember as likes go with likes.

6—Calculations are made easy and errors are easily
detected and corrected.


A few Illustrations


Illustration—1


Draw up a table to show the number of wholly unemployed,
temporarily stopped and the total unemployed, each class being
divided into males and females, for the following industries :


Fishing, coal-mining, iron ore mining, cotton manufacturing,
wool and worsted, engineering and ship building.


                        Wholly          Temporarily         Total
                      Unemployed          Stopped         Unemployed
Industries           Male | Female     Male | Female     Male | Female

Fishing
Coal-mining
Iron ore mining
Cotton Manufacture
Wool and worsted
Engineering
Ship-building

Total
Illustration—2
Prepare a blank table to show the distribution of population
according to sex and four religions in five age-groups in seven
important cities of U.P. (B. Com. Agra).
The table can be drawn as given on page 12.
Illustration—3


Present the data given in the following paragraph in the
form of a table, so as to bring out clearly all the facts, indicating
the source and bearing a suitable title.

According to the Census of Manufacturers Report, 1945,
the John Smith Manufacturing Company employed 400
non-union and 1250 union employees in 1941. Of these, 220 were
females of which 140 were non-union. In 1942, the number
of union employees increased to 1475 of which 1300 were males.
Of the 250 non-union employees 200 were males. In 1943,
1700 employees were union members and 50 were non-union.
Of all the employees in 1943, 250 were females of which 240
were union members. In 1944, the total number of employees
was 2000 of which one per cent were non-union. Of all the
employees in 1944, 300 were females of which only 5 were
non-union.


TABLE SHOWING THE EMPLOYEES OF THE JOHN SMITH
MANUFACTURING CO., ACCORDING TO SEX &
TRADE-UNION MEMBERSHIP


         Union-members             Non-union                 Total
Year   Males Females Total    Males Females Total    Males Females Total

1941   1,170     80  1,250      260    140    400    1,430    220  1,650
1942   1,300    175  1,475      200     50    250    1,500    225  1,725
1943   1,460    240  1,700       40     10     50    1,500    250  1,750
1944   1,685    295  1,980       15      5     20    1,700    300  2,000
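
The arithmetic behind Illustration 3 can be checked with a short Python sketch. It is offered only as an illustration of how the missing cells follow from the figures given in the paragraph; the helper `row` and the dictionary layout are hypothetical.

```python
def row(union_m, union_f, non_m, non_f):
    """Return one table row with all totals filled in."""
    return {
        "union": (union_m, union_f, union_m + union_f),
        "non_union": (non_m, non_f, non_m + non_f),
        "total": (union_m + non_m, union_f + non_f,
                  union_m + union_f + non_m + non_f),
    }


table = {}

# 1941: 1250 union, 400 non-union; 220 females, 140 of them non-union.
union_f = 220 - 140
table[1941] = row(1250 - union_f, union_f, 400 - 140, 140)

# 1942: 1475 union (1300 males); 250 non-union (200 males).
table[1942] = row(1300, 1475 - 1300, 200, 250 - 200)

# 1943: 1700 union, 50 non-union; 250 females, 240 of them union.
non_f = 250 - 240
table[1943] = row(1700 - 240, 240, 50 - non_f, non_f)

# 1944: 2000 employees, one per cent non-union; 300 females, 5 non-union.
non_union = 2000 * 1 // 100
table[1944] = row((2000 - non_union) - (300 - 5), 300 - 5, non_union - 5, 5)

for year, r in table.items():
    print(year, r["union"], r["non_union"], r["total"])
```

Each printed row gives males, females and total for union members, non-union employees and all employees, in that order.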
Illustration —4 


The State of Rajputana had only one city with a population
of a lakh and over in 1921 and 1931. It was Jaipur. In 1941,
two more cities, Jodhpur and Bikaner, were added. In 1951,
there were no further additions. It is significant that in 1941
the two new cities had identical populations, exactly
the same as that of Jaipur twenty years ago. The total
population of the three cities in 1941 was 430 thousand, of which
about 40% (the true figure being 1,76,000) was accounted for
by Jaipur alone, the remainder being shared equally by the
other two cities. In 1941, while Jaipur and Jodhpur added
respectively 26,000 and 32,000 to their 1931 numbers, the
addition in the case of Bikaner was much larger, being 15
thousand more than that in Jaipur. In 1921, the combined
population of Jaipur and Jodhpur was just two lakhs and
Bikaner had four thousand souls less than Jodhpur. The effect
of events like partition and integration of states on the size
of these cities was reflected in the 1951 census figures. In that
year, as compared with 1941, Bikaner registered a decline of
10,000 persons, Jodhpur an increase of 47,000 souls, and
Jaipur an increase which was only two thousand less than the
total population of Bikaner in 1951, or which was five times
the increase Jaipur registered between 1921 and 1931. It will
be seen that during the thirty years intervening 1921 and 1951,
the growth of population in Bikaner was only 48,000 while that
in Jodhpur was 1,01,000 and that in Jaipur higher still, 164
thousand. It will also be seen that while Jaipur continued to
enjoy the distinction of being Rajasthan's city No. 1, Bikaner,
which occupied the 2nd place in 1941 along with Jodhpur, gave
that place to Jodhpur in 1951 and got relegated to the third.

From the above account prepare a table showing the
population of the three cities for the different census years.
(M. Com. Raj.)


POPULATION OF JAIPUR, JODHPUR & BIKANER
IN 1921, 1931, 1941 AND 1951


Year      Jaipur      Jodhpur     Bikaner     Total

1921     1,27,000      73,000      69,000    2,69,000
1931     1,50,000      95,000      86,000    3,31,000
1941     1,76,000    1,27,000    1,27,000    4,30,000
1951     2,91,000    1,74,000    1,17,000    5,82,000
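
The reasoning that turns the account in Illustration 4 into a table can be sketched step by step in Python. All figures are in thousands; the dictionary `pop` is a hypothetical layout chosen for illustration, and each statement of the account becomes one line of arithmetic.

```python
# All figures are in thousands.
pop = {year: {} for year in (1921, 1931, 1941, 1951)}

# 1941: total 430, Jaipur 176, remainder shared equally by the other two.
pop[1941]["Jaipur"] = 176
pop[1941]["Jodhpur"] = pop[1941]["Bikaner"] = (430 - 176) // 2

# 1931: the 1941 figures less the additions made between 1931 and 1941.
pop[1931]["Jaipur"] = pop[1941]["Jaipur"] - 26
pop[1931]["Jodhpur"] = pop[1941]["Jodhpur"] - 32
pop[1931]["Bikaner"] = pop[1941]["Bikaner"] - (26 + 15)

# 1921: Jaipur equalled each new city's 1941 figure;
# Jaipur + Jodhpur = 200; Bikaner = Jodhpur - 4.
pop[1921]["Jaipur"] = pop[1941]["Jodhpur"]
pop[1921]["Jodhpur"] = 200 - pop[1921]["Jaipur"]
pop[1921]["Bikaner"] = pop[1921]["Jodhpur"] - 4

# 1951: changes over 1941; Jaipur's increase is Bikaner's 1951 figure less 2.
pop[1951]["Bikaner"] = pop[1941]["Bikaner"] - 10
pop[1951]["Jodhpur"] = pop[1941]["Jodhpur"] + 47
pop[1951]["Jaipur"] = pop[1941]["Jaipur"] + (pop[1951]["Bikaner"] - 2)

# Cross-check against the other statements in the account.
assert pop[1951]["Jaipur"] - pop[1941]["Jaipur"] == \
    5 * (pop[1931]["Jaipur"] - pop[1921]["Jaipur"])
assert pop[1951]["Jaipur"] - pop[1921]["Jaipur"] == 164

totals = {year: sum(cities.values()) for year, cities in pop.items()}
print(pop)
print(totals)
```

The cross-checks use the statements that were not needed for the derivation (the five-fold increase and the thirty-year growth), so they serve as an independent test of the table.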
Questions 


1. Write an essay on the process of collection and tabulation 
of statistical data. 
(B. A. Travancore) 
2. What precautions would you take in tabulating your data ?
Prepare a blank table to show the distribution of population
according to sex and four religions in five age groups in seven
important cities of U.P.
(B. Com. B. H. U.)
3. "In collection and tabulation common sense is the chief 
requisite and experience the chief teacher."—Bowley. 
What precautions in your opinion are necessary to avoid 
statistical errors in the collection and computation of primary data ? 


(M. А. Agra) 


4. “Classification is the process of arranging things either 
actually or notionally in groups or classes according to their 
resemblances and affinities giving expression to the unity of 
attributes that may subsist amongst a diversity of individuals.” 
Elucidate the above statement. 

(B. Com. Alld.) 
5. How would you proceed to classify the observations made
and what points will you take into consideration in tabulating
them ? Mention the kinds of tables generally used.
(B. Com. Agra)


6. Discuss the function and importance of tabulation in a 
scheme of investigation. 


Prepare blank tables, showing the distribution of students of 

a college according to age, class and residence for arranging 
(a) Physical training and (b) Tutorial classes. 

(B. Com. Agra) 


7. You are given a statistical table. What questions would
you ask before accepting it ? Draft a form of tabulation to
show :—


(a) Sex. (b) Three ranks—Supervisors, assistants and
clerks. (c) Years 1918 and 1948. (d) Age groups—18 years
and under, over 18 but less than 55 years, over 55 years.
(B.A. Madras)


8. Prepare a table with proper title, divisions and
sub-divisions to represent the following heads of information :—


(a) Export of Cotton piece-goods from India ; (b) To
Burma, China, Java, Iran, Iraq ; (c) Quantity of piece-goods to
each country ; (d) Value of piece-goods to each country ; (e)
From 1939-40 to 1945-46, year by year ; (f) Total quantity
exported each year ; (g) Total value of exports each year.
(M. Com. Alld.)


9. Re-arrange the following blank table with a view to
make it more intelligible :—


Sex        Brahmin  |  Rajput  |  Vaish  |  Harijan
           [sub-headings under each caste not legible]

Male
Female


(B. Com. Alld.)


10. What precautions should be taken in the tabulation of data ?
Point out the mistakes made in the following table drawn to show
the distribution of population according to sex, age and literacy.
(B. Com. Luck.)


11. Arrange the following marks in a frequency table taking
the lowest class interval as (10—20)


(B.A. Andhra)


12. Explain the procedure for processing any raw data into 
a Frequency table. In particular how would you fix the 
magnitude and centres of class intervals ? 


The following is the record of marks obtained by 90 
candidates in an examination. Form a frequency distribution :— 


(B.A. Madras) 


13. Define frequency distribution. State the principles to
be observed in its formation.

The following is a record of weights of 70 students (in lbs).
Tabulate the data in the form of a frequency distribution taking
the lowest class as (60—69).
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61 | 78- :98 107 11291479! 078009169" 06 ane 
80 88 96 109 103 84 84 106 91 75 
91 92 102 91 101 90 77 105 90 86 
118 101. 114. 72 17 118 95. 68 99.32 
100 106 87 89 92 107 111 76 83 86 
106 107 62 94 78 108 115 85 98 98 
109 97 74 98 67 82 104 88 88 92 


(В.А. Travancore) 


14. Put the facts of the following extract in tabular form :—
“The shipping returns for the port of London for February 1926
show a reduction in the aggregate of tonnage entered and an
increase in tonnage cleared when compared with the corresponding
month a year ago. They also reveal a retrogression in the position
of British as compared with foreign shipping. Total tonnage
entered amounted to 3,805,000 tons as against 3,826,000 tons, the
British share of which fell from 2,594,000 tons to 2,568,000 tons.
Tonnage cleared in February 1926 was about 32,000 tons higher
at 4,796,000 tons but the share of British tonnage fell from
3,091,000 tons to 3,076,000 tons.”

(M.A., Calcutta)


15. Draw up in detail, with proper attention to spacing,
double lines etc., showing all sub-totals, a blank table in which could
be entered the number of persons occupied in six industries, on two
different dates, distinguishing males from females and among the
latter, single, married and widowed. (B. Com. Gujrat)


16. Present in a tabular form with suitable captions etc.
the information contained in the following :


“In 1945, out of a total of 1750 workers of a factory, 1200
workers were members of a trade union. The number of women
employed was 200 of which 175 did not belong to a trade union.
In 1950, the number of union workers increased to 1580 of which
1290 were men. On the other hand the number of non-union
workers fell down to 208 of which 180 were men. In 1955, there
were on the pay rolls of the factory 1800 employees who belonged
to a trade union and 50 who did not belong to a trade union.
Of all the employees in 1955, 300 were women of whom only
8 did not belong to a trade union.” (B. Com. Bombay)


CHAPTER—6


DIAGRAMS 


“Cold figures are uninspiring to most people. Diagrams
help us to see the pattern and shape of any complex situation.
Just as a map gives us a bird’s eye view of a wide stretch of
country, so diagrams help us to visualize the whole meaning of
a numerical complex at a single glance. Give me an undigested
heap of figures and I cannot see the wood for the trees. Give
me a diagram and I am positively encouraged to forget detail
until I have a real grasp of the overall picture. Diagrams register
a meaningful impression almost before we think.”


MORONEY


Figures, at best, are not easy things for the mind to grasp
and hold long enough for purposes of comparison. When read
to an audience, they become practically meaningless. In a book,
their essential indications can only be ascertained by careful
scrutiny. One of the chief aims of statistical science is to
render the meaning of masses of figures clear and comprehensible
at a glance. To attain this end, many devices have been invented
to supplement or explain the table ; of these, diagrammatic
representation is the most common and popular. Diagrams
have now become important tools in the hands of
statisticians to portray statistical data. The credit for the
introduction of diagrams into Statistical Science goes to William
Playfair, who used them in 1786.


The purpose of tabulation is to arrange masses of unwieldy
data in a logical and precise way. The object of diagrammatic
representation is to illustrate statistical facts. Diagrams are
necessary for explanation and exposition of statistical data.
However clearly and concisely the tables might have been
compiled, it requires a certain amount of time and an expert’s
eye to know the salient features of silent figures. Figures are
always dull and confusing, specially when they are complicated
and large. Therefore at times it becomes necessary to adopt
some other device which may present the uninteresting
numerical data in a way that is at once comparable and appealing
both to eye and intelligence. To put it in the words of
Dr. Bowley, “any list of figures becomes less comprehensible as
its length increases. A series of ten numbers can perhaps be
easily grasped, of seventy only with an effort, while a printed
list of figures for one hundred successive years leaves hardly any
impression on our mind at all ; we cannot see the wood for the
trees.”


The method of visual aids, which consists of presenting
statistical material with the help of pictures, geometrical figures
etc., has been devised to serve this purpose. A properly
constructed diagram appeals to the eye and also to the mind
because it is practical, clear and easily understandable even by
those unacquainted with the methods of presentation.


Utility of Diagrams. Diagrams do not add any new
meaning to the statistical facts, but they exhibit the results more
clearly. They are helpful in understanding the facts, specially
to those who have no statistical training. Scanning the figures
in the tables becomes very tiresome to the eye and often
confuses an average person. Diagrammatic representation of
statistical facts is the best way of appealing to the mind through
the eye. For example, a few well-designed but simple diagrams
showing the trend of sales and costs will be definitely more
eloquent to a businessman than a mass of detailed monthly
figures. Even the statistician himself will employ diagrams to
ascertain the pattern or distribution of his data because the
character of the distribution will sometimes determine the type
of statistical analysis he will employ. Diagrams are useful
because facts can be known at a glance. They are readily
intelligible and save a considerable amount of time and energy.
Diagrams are more attractive than figures. Figures when
told through diagrams become attractive. In other words, pages
of cold figures can be made to speak in clear tones if translated
into the diagrammatic language. Not only this, diagrams are
also an important means of detecting mistakes in statistical
computations. They suggest the lines for further investigations.
Thus we see that diagrams occupy an important place in statistical
methods, because,

(1) They are attractive and create a lasting impression. A
person who does not like to devote even a single minute to the
study of a page containing numerical tables would, in most cases,
not like to take his eyes away from a picture relating even to
the same topic.


(2) Diagrams have the merit of rendering the whole data
readily intelligible.


(3) Diagrams make comparison possible.


(4) They save a lot of time which would otherwise have been
lost in grasping the significance of numerical data.


(5) They make unwieldy data, requiring a number of pages
to write, visible at a glance.


Limitations of Diagrams. Diagrams are important
statistical tools no doubt, but they are not to replace classification
and tabulation. They are rather complementary to those
processes. Diagrams present the facts as they are without
adding new meaning to them. Diagrammatic representation is
risky in the hands of those who draw inferences from it
without making a careful study. These delicate tools are liable
to be easily misinterpreted and consequently may be used for
grinding one’s axe during propaganda or advertisement.
Diagrammatic representation has three serious limitations : (a) diagrams
can vividly show only a limited amount of information, whereas
statistical tables can be divided into a number of columns and rows
so as to depict many sets of facts ; (b) they can show only approximate
values ; and (c) a diagram is limited to the portrayal of two or
three aspects of a set of data, otherwise it becomes too complex
to be understood.

General Rules for Diagrammatic Presentation. The
diagrammatic representation of statistical facts will be
advantageous provided the following rules are observed in drawing
diagrams :—


1—Heading :—Every diagram must have a suitable but
short heading. In cases where the matter is likely to be
misunderstood, a short sub-heading can also be given.


2—Size :—The size of the diagram should be such as will
allow its significant features to be clearly perceived. A
diagram should not appear clumsy and should not be too small
or too big.


3—Scale :—In determining the scale of a diagram it should
be borne in mind that the figures drawn should be in a position
to show clearly the necessary details.


4—Drawing :—As a diagram is required to be impressive, 
it should be neatly as well as accurately drawn with the help 
of drawing instruments. The scale adopted must be strictly 
and rigidly followed. Colouring, dotting, crossing or cross- 
hatching should be done either in pencil or in colour, but it 
should be free from blots. 


5—Index :—When many items are shown in a diagram,
different colours, dotting or crossing etc. may be used. Hence
an index must be given for identification.


6—Presentation :—The original data on which the diagram
has been based should be given, if possible, facing the diagram.
This will help the observer to see the details with accuracy
and there will remain no possibility of any misunderstanding.


7—Economy :—The test of a good diagram depends upon 
the speed and ease with which the observer can interpret it. 
Economy in cost and energy should be exercised in drawing it. 


Essential requisites of a good diagram. In drawing a good 
diagram it is necessary to bear in mind its artistic as well as 
scientific aspect. A diagram designed with great care but 
which is not properly drawn and finished will fail to command 
the necessary attention with the result that the time and 
energy will be wasted. On the contrary, the diagram drawn 
quite artistically but not correctly designed will receive undue 
attention and may even sometimes give a wrong impression,
which is not desirable. Diagrams must be neat and drawn strictly
to scale. They must at the same time have a good visual effect
because if the diagrams are too complicated or if they are 
poorly designed they will fail to create desired effect. The 
design of the diagram should be simple and it should not 
involve too much labour and yet should bring out comparisons 
whenever possible. Lastly the cost, time and labour involved 
should always be proportionate to the utility of the diagram. 
The guiding principle should be the maximum utility with the 
minimum labour and cost. 


Types of Diagrams


There is a large number of diagrammatic forms to choose
from. The choice of the types of diagrams in which the data 
are to be presented is a difficult one. Selection of the type 
will depend upon ability and experience. The type selected 
should be such as to represent the data in a clear, concise and 
direct manner. The following are the common types of 
diagrams :— 


1—One-dimensional Diagram.
2—Two-dimensional Diagram.
3—Three-dimensional Diagram.
4—Pictograms.
5—Cartograms.


1—One Dimensional Diagram :—Bars are one dimensional
diagrams. They are simple to draw and easy to understand.
A bar diagram is called one dimensional because only the height
of a bar represents the size of the figure; the width is not
considered and is shown merely for attraction, though all bars
drawn must be of equal width. Bar diagrams may be of the following
types :—


(a) Simple Bar Diagram.
(b) Multiple Bar Diagram.
(c) Sub-divided Bar Diagram.
(d) Profit and Loss Diagram.
(e) Sub-divided Bar Diagram drawn on the percentage
basis.


(a) Simple Bar Diagram—Simple bars can be drawn 
either on horizontal or vertical base, but bars on horizontal 
base are more common. Comparison is facilitated in bars. When 
the number of items is large lines may be drawn instead of 
bars. Bars must be of uniform width and intervening space 
between bars must be equal. 


Illustration—1


Represent the following data diagrammatically :—

Loans up to December 1960 from Foreign Countries

                                 In crore rupees
1. U.S.A.                             974.64
2. U.S.S.R.                           383.41
3. U.K.                               162.66
4. West Germany                       150.58
5. I.B.R.D.                           219.44
6. Japan                               27.61
7. From other countries               173.23

   Total                             2091.57


(If it is not a time series, bars should be arranged in ascending
order.)
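The ordering rule and the choice of scale can be sketched in a few lines of Python. This is only an illustration: the lender names are taken as best read from the table, and the scale of 200 crores to the inch is an assumed value, not one given in the text.

```python
# Loans to India up to 31st Dec. 1960, in crores of rupees (Illustration 1).
loans = [
    ("U.S.A.", 974.64),
    ("U.S.S.R.", 383.41),
    ("U.K.", 162.66),
    ("West Germany", 150.58),
    ("I.B.R.D.", 219.44),
    ("Japan", 27.61),
    ("Other countries", 173.23),
]

UNITS_PER_INCH = 200.0  # hypothetical scale: 1 inch = Rs. 200 crores

# Not a time series, so arrange the bars in ascending order of size.
ordered = sorted(loans, key=lambda item: item[1])

# Length of each bar on paper, in inches, at the chosen scale.
bars = [(name, round(amount / UNITS_PER_INCH, 2)) for name, amount in ordered]
total = round(sum(amount for _, amount in loans), 2)

for name, length in bars:
    print(f"{name:<16}{length:>6.2f} in.")
print("Total (crores):", total)
```

With this scale the longest bar is a little under five inches, which fits an ordinary page; a different scale would only change `UNITS_PER_INCH`.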


[Figure: Foreign Loans to India up to 31st Dec. 1960 (in crores of Rs.)]
Illustration—2


Draw a suitable diagram to represent the following data :—


Year            Sales (Rs.)
1953              10,000
1954              12,000
1955              15,000
1956              14,000
1957              17,000
1958              16,000
1959              18,000
1960              20,000

[Figure: Sales in 1953-60; vertical bars for the years 1953 to 1960]
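The rules for a simple bar diagram (uniform width, equal gaps, height in proportion to the figure) can be sketched as follows. The scale of Rs. 4,000 to the centimetre and the half-centimetre width and gap are assumptions chosen for illustration.

```python
# Sales figures from Illustration 2 (rupees).
sales = {
    1953: 10_000, 1954: 12_000, 1955: 15_000, 1956: 14_000,
    1957: 17_000, 1958: 16_000, 1959: 18_000, 1960: 20_000,
}

RUPEES_PER_CM = 4_000  # hypothetical scale: 1 cm = Rs. 4,000


def layout(data, rupees_per_cm, bar_width_cm=0.5, gap_cm=0.5):
    """Return (label, left edge in cm, height in cm) for each bar.

    Every bar has the same width and the same gap after it, so only
    the height carries information.
    """
    bars = []
    for i, (label, value) in enumerate(data.items()):
        left = i * (bar_width_cm + gap_cm)
        bars.append((label, left, value / rupees_per_cm))
    return bars


bars = layout(sales, RUPEES_PER_CM)
for label, left, height in bars:
    print(label, left, height)
```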
Illustration—3


The following figures show the Progress of Civil Aviation
in India from 1947 to 1956 :—


Year            Miles Flown
                 (in 000)
1947               9,362
1948              12,649
1949              15,098
1950              18,896
1951              19,498
1952              19,562
1953              19,802
1954              19,798
1955              21,266
1956              28,418


Represent the above data by a Horizontal Bar Diagram.

[Figure: Progress of Civil Aviation, miles flown in 000; horizontal bars for the years 1947 to 1956]


Multiple Bar Diagram. Such diagrams are used when
comparison is to be made between two or more phenomena over
a number of years. In order to distinguish the bars, they may
be either differently coloured or marked with different
crossings or dottings etc. Such diagrams are also called
compound bar diagrams.


Illustration—4


Represent the following data by a suitable diagram.


                          1954     1955     1956     1957
                          (In Rs. abja, i.e. 100 crores)

1. Agriculture            49.8     50.3     50.2     52.6
2. Com. & Industry        34.8     36.1     37.3     39.2
3. Others                 15.7     16.4     17.3     18.3

   Total                 100.3    102.8    104.8    110.1

[Figure: multiple bar diagram for 1954 to 1957 (Rs. abja)]


Sub-Divided Bar Diagram. In a sub-divided bar diagram
the bar is sub-divided into various parts in proportion to the
values given in the data and the whole bar represents the total.
The sub-divisions are distinguished by different colours. The
main defect of such a diagram is that all the parts do not have
a common base to enable us to compare accurately the various
components of the data.


Illustration—5


The following table gives an analysis of bank advances as
on 30th June 1953 according to their purpose :—


Bank Advance to—          Scheduled    Non-Scheduled    Total
                           Banks          Banks
                                (In crores of rupees)

Commerce                   188.98          6.52         195.50
Industry                   262.65         18.08         280.73
Agriculture                 22.99          2.12          25.11
Personal and Professional   41.00         11.28          52.28
All others                  29.25          2.95          32.20

Total                      544.87         40.95         585.82


Represent the above data by sub-divided Bars.

[Figure: Bank Advances; sub-divided bars by purpose (Commerce, Industry, etc.), each split into Scheduled Banks and Non-Scheduled Banks, in crores of Rs.]
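The sub-division of each bar amounts to running totals of its components. The sketch below uses the figures of Illustration 5; the helper name `segments` is hypothetical and introduced only for illustration.

```python
# Bank advances on 30th June 1953, crores of rupees (Illustration 5):
# (Scheduled Banks, Non-Scheduled Banks) for each purpose.
advances = {
    "Commerce": (188.98, 6.52),
    "Industry": (262.65, 18.08),
    "Agriculture": (22.99, 2.12),
    "Personal and Professional": (41.00, 11.28),
    "All others": (29.25, 2.95),
}


def segments(parts):
    """Return (lower, upper) limits of each part stacked on the one before."""
    edges = []
    lower = 0.0
    for part in parts:
        upper = lower + part
        edges.append((lower, upper))
        lower = upper
    return edges


bars = {purpose: segments(parts) for purpose, parts in advances.items()}
for purpose, edges in bars.items():
    print(purpose, edges)
```

Only the first component of each bar starts from the common base, which is the defect noted in the text: the Non-Scheduled portions begin at different heights and are harder to compare by eye.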


Bar Diagram showing Differences. Sub-divided bar
diagrams can also be used to show the difference between figures
such as imports and exports, death rate and birth rate, etc. In
such a case a bar is drawn representing the fact which is
greater, and the value of the other fact is cut from it. Both
facts are represented in different colours.


Illustration—6


Present the following figures relating to Birth and Death
rates in certain selected countries by sub-divided bars :—


Country          Birth Rate     Death Rate
India                31             18
Canada               28              8
Ceylon               38             11
Australia            28              9
France               19             12
Japan                19              8
Sweden               15              9
U.K.                 15             12
U.S.A.               25              9

[Figure: Birth Rate and Death Rate; one sub-divided bar per country]

(Total shows birth rate, the lower portion shows death
rate and the upper portion shows survival rate)


In the above illustration the birth rate was greater in all the
countries, so it was easy to draw bars for the birth rate and cut
off the death rate. In cases like imports and exports, where either
imports or exports may be the greater, the bar will be drawn
according to the item which is greater and the value of the other
will be cut from it; the balance will appear as the difference, or
balance of trade.


Illustration—7


Show the following data by a suitable diagram :—


                         (Rs. crores)

Year      Imports     Exports     Balance of trade
1952       454.4       275.0         — 179.4
1953       329.3       291.6         —  37.7
1954       330.2       351.4         +  21.2
1955       376.7       397.5         +  20.8
1956       281.8       306.2         +  24.4
1957       302.3       291.2         —  11.1
1958       357.4       339.7         —  17.7
1959       357.2       331.2         —  26.0

[Figure: India's Balance of Payments with Sterling Area on Current Account; bars for imports and exports, 1952 to 1959, in crores of Rs.]
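The rule "draw the bar for the greater item and cut the other from it" can be sketched as below. The figures are those of Illustration 7 as reconciled with its balance column; the function name `difference_bar` is hypothetical.

```python
# (imports, exports) in crores of rupees, from Illustration 7.
trade = {
    1952: (454.4, 275.0), 1953: (329.3, 291.6),
    1954: (330.2, 351.4), 1955: (376.7, 397.5),
    1956: (281.8, 306.2), 1957: (302.3, 291.2),
    1958: (357.4, 339.7), 1959: (357.2, 331.2),
}


def difference_bar(imports, exports):
    """Return (full height, cut point, balance of trade).

    The bar is as tall as the greater figure and is cut at the smaller
    one; the portion above the cut is the balance, positive when
    exports exceed imports.
    """
    height = max(imports, exports)
    cut = min(imports, exports)
    balance = round(exports - imports, 1)
    return height, cut, balance


for year, (imp, exp) in trade.items():
    print(year, difference_bar(imp, exp))
```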


Another sub-divided bar diagram. 


Illustration—8


With the help of the following data regarding the Indian
National Income between 1950-51 and 1953-54 draw a suitable
diagram :—

                 National Income in Crores of Rupees

Source                           1953-54   1952-53   1951-52   1950-51
Agriculture                       5,400     4,790     4,990     4,890
Mining, Manufacturing and
  handicrafts                     1,800     1,760     1,730     1,530
Commerce, transport and
  communication                   1,00      1,780     1,790     1,690
Other services                    1,610     1,440     1,500     1,440

Total                            10,010     9,870    10,010     9,540

[Figure: National Income (in crores of Rs.), 1950-51 to 1953-54; sub-divided bars with index: Agri., Mining etc., Com., Others]


Deviation Bars. Such bars are drawn to present net
quantities, i.e. excess or deficit, e.g. balance of trade, net
profit etc. The positive and negative values are shown above and
below the X-axis respectively.


Illustration—9

Trade with Pakistan.
(In million rupees)

Year      Exports     Imports        Balance
                                   +         —
1954         24           9        12
1955        115          92        23
1956         84          92                   8
1957        110         120                  10
1958        136         183                  47
1959        162         187                  25

[Figure: Balance of Trade with Pakistan 1954-1959; deviation bars on a scale from +40 to —40]

Floating Bars. Floating bars may be drawn to show an 
aggregate composed of two parts. One part can be shown below 
the base and the other above it. Both parts added together 
give total value. 


Illustration—10


Show the following data by means of floating bars :—

Income of a Shoe Manufacturer of Agra (In Rs. 000)

Year      Gross Income     Manufacturing     Net Income
                             Expenses
1956          12                5                7
1957          13                5                8
1958          16                6               10
1959          23                8               15
1960          27               12               15

[Figure: floating bars, Net Income and Mfg. Expenses, 1956 to 1960]
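A floating bar can be described by the two distances it extends from the base line. The sketch below assumes, as the figure labels suggest, that manufacturing expenses are drawn below the base and net income above it; the function name is hypothetical.

```python
# (gross income, manufacturing expenses) in Rs. 000, from Illustration 10.
records = {
    1956: (12, 5),
    1957: (13, 5),
    1958: (16, 6),
    1959: (23, 8),
    1960: (27, 12),
}


def floating_bar(gross, expenses):
    """Return (bottom, top) of the bar measured from the base line.

    Expenses hang below the base (negative) and net income rises
    above it, so the whole length of the bar equals gross income.
    """
    net = gross - expenses
    return -expenses, net


for year, (gross, expenses) in records.items():
    bottom, top = floating_bar(gross, expenses)
    print(year, bottom, top, "total length:", top - bottom)
```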


Paired Bars. Paired bars may be drawn to compare two 
quantities respecting a number of items. 


Illustration—11


Represent the following data by a suitable diagram :—
Area and production of rice in some countries of Europe.


Country           Area                Production
             (1000 hectares)      (1000 metric tons)
Italy             168                   818
Spain              64                   325
Portugal           33                   149
Greece             22                    75
France             20                    80

[Figure: Paired bars showing area (thousand hectares) and production (thousand metric tons) of rice in some countries of Europe]
Line and Bar Diagram. Such a diagram is used to compare
similar data for two or more years.

Illustration—12

Draw a suitable diagram to show the following data :—
Number of students on roll in a college in 1952 and 1953.


Classes        Number in 1958     Number in 1960
B.A.                150                200
B.Com.              100                125
B.Sc.                80                100
M.A.                 70                 60
M.Com.               40                 50
M.Sc.                30                 40

[Figure: Number of Students on Rolls, 1958 & 1960]
Population-Pyramid. Bars are so drawn to show population
data as to look like a pyramid.


Illustration—13

Show the following data diagrammatically :—
Population of a town according to age-group, sex, and illiteracy.


Age-group      Males     No. illiterate     Females     No. illiterate
                           in males                       in females
Below 20      10,000        1,000            9,600          1,500
20—40          8,000        1,000            7,500          2,000
40—60          6,000        2,000            5,700          1,500
60—80          4,000        1,000            3,000            500
Above 80       2,000          500            1,500            500

[Figure: Population Pyramid; horizontal bars by age-group in thousands of persons, males on one side and females on the other, with the illiterate portion marked]

Progress Chart-Gantt Chart. Such a chart is used in time
and motion studies to compare the actual work done with the
allotted quota. Actual work done is expressed as a percentage
of the allotted quota.

Illustration—14


Draw a Gantt's chart from the following data :—


Percent of the work done in the manufacturing sections
as against the allotted quota.


Sections    Monday    Tues.    Wed.    Thurs.    Fri.    Sat.    Total
A             80        …       …        …        …       …       …
B             60       60      75       80      100     100      475
C              …        …       …        …        …       …      440
D             75       80     100      100      100     140      595


Allotted Quota for each week day 100%.

[Figure: Gantt chart showing the percent of the work done during the week (June 15-20) as against the allotted quota]

The heavy line shows the work done during the week. 


Broken Bars. When lengths of bars to be drawn are of 
unusual sizes, they may be broken at a suitable point. Such 
bars must be read very carefully. Such bars are shown as 
illustrated below :— 


[Figure: broken bars, with the vertical scale marked from 180 to 240]


Sub-divided Bar Diagram drawn on Percentage basis.
So far we have studied bars depicting absolute changes. For
purposes of comparison, relative changes may be studied with the
help of a percentage diagram. All components are changed into
percentages on the basis of their total taken as 100. For
dividing the bars, these percentages are cumulated.
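The conversion to percentages and their cumulation can be sketched as follows, using the First Plan outlay figures of the illustration that follows. The function name is hypothetical, and because of rounding the last decimal may differ slightly from a hand-worked table.

```python
from itertools import accumulate

# First Five-Year Plan outlay in crores of rupees, heads (a) to (f).
first_plan = [357, 661, 179, 557, 533, 69]


def percent_breakdown(values):
    """Return (percentages, cumulative percentages) with the total as 100."""
    total = sum(values)
    percentages = [100 * v / total for v in values]
    cumulative = list(accumulate(percentages))
    return percentages, cumulative


pcts, cum = percent_breakdown(first_plan)
for value, p, c in zip(first_plan, pcts, cum):
    print(f"{value:>6}{p:>8.1f}{c:>8.1f}")
```

The cumulative column gives the heights at which the bar of 100 is divided, so the last entry is always 100.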
Illustration—15


The following table gives the distribution of outlay in the
two Five-Year Plans of India under major heads of development
expenditure :—


Heads of Expenditure                     First Plan    Second Plan
                                           (in crores of Rupees)
(a) Agriculture and Community
    Development                              357           568
(b) Irrigation and Power                     661           913
(c) Industry and Mining                      179           890
(d) Transport and Communication              557         1,385
(e) Social Services                          533           945
(f) Miscellaneous                             69            99

Total                                      2,356         4,800


Represent the above information by a suitable diagram.
(B. Com. B.H.U., Vikram)


PLAN OUTLAY AND ALLOCATIONS

                                First Plan                Second Plan
                           (Rs. 2,356 crores = 100%)  (Rs. 4,800 crores = 100%)
Heads of Expenditure       Crores      %     Cum.     Crores      %     Cum.
                           of Rs.             %       of Rs.             %
(a) Agriculture and
    Community Development    357     15.1    15.1       568     11.8    11.8
(b) Irrigation and Power     661     28.1    43.2       913     19.0    30.8
(c) Industry and Mining      179      7.6    50.8       890     18.5    49.3
(d) Transport and
    Communication            557     23.6    74.4     1,385     28.9    78.2
(e) Social Services          533     22.6    97.0       945     19.7    97.9
(f) Miscellaneous             69      3.0   100.0        99      2.1   100.0

Total                      2,356    100.0     —       4,800    100.0     —

[Figure: The Two Plans of India's Economic Development; percentage sub-divided bars for the First Plan and the Second Plan]


Profit and Loss or Bilateral Diagram. If the data regarding
the cost of production of a particular commodity and its
sale price are given for a number of years, sub-divided percentage
bars can be drawn to show the percentage of each element of
cost and of profit or loss. In such a case the sale price is taken
as equal to 100.
Illustration—16


Represent the following by sub-divided Bar Diagram
drawn on the percentage basis as well as showing absolute
figures :—

Particulars                      1958      1959      1960
Proceeds per pair of shoes      10.00      9.50     10.00
Cost :— (a) Material             3.00      3.00      4.50
        (b) Wages                2.00      2.50      3.00
        (c) Other Expenses       1.00      1.50      2.00
        (d) Finishing            2.00      2.50      2.50

Total                            8.00      9.50     12.00

Profit or Loss                    +2        —         —2

[Figure: Diagram showing cost, expenses and profit or loss, 1958 to 1960: (a) Relative, (b) Absolute; index: Material, Wages, Other Expenses, Finishing]


2—Two-dimensional Diagram. In one dimensional 
diagrams, only length is taken into account. But in two-dimen- 
sional diagrams the area represents the data and so the length 
and breadth have both to be taken into account. Such diagrams 
are also called area diagrams or surface diagrams. The most 
common forms of area diagrams are :— 


1—Rectangles
2—Squares
3—Circles or Pie diagrams


Rectangles. The area of a rectangle is to represent the
total units, thus the breadth is fixed in proportion to the size
of the data. The rectangles may be sub-divided to represent
components.




Illustration—17


Represent the following data by a two dimensional diagram :


Items of Expenditure             Family A            Family B
                             (Income Rs. 500)    (Income Rs. 800)
1. Food                            200                 250
2. Clothing                        100                 200
3. House Rent                       80                 100
4. Fuel and lighting                40                  50
5. Miscellaneous
   (Including saving)               80                 200

   Total                           500                 800

(B. Com. B. H. U.)


The figures are to be converted into percentages as given
below :—


                              Family A                  Family B
                          (Rs. 500 = 100)           (Rs. 800 = 100)
Items of Expenditure      Rs.     %     Cum. %      Rs.     %     Cum. %
1. Food                   200   40.00    40.00      250   31.25    31.25
2. Clothing               100   20.00    60.00      200   25.00    56.25
3. House Rent              80   16.00    76.00      100   12.50    68.75
4. Fuel & lighting         40    8.00    84.00       50    6.25    75.00
5. Miscellaneous           80   16.00   100.00      200   25.00   100.00

   Total                  500  100.00               800  100.00

[Figure: Monthly Expenditure; percentage rectangles for Family A and Family B]
Illustration—18

Represent the following data by means of a suitable two
dimensional diagram :—


                                   A                  B
Price of commodity           Rs. 2 per unit     Rs. 3 per unit
Quantity sold                  40 units           20 units
Value of Raw material           Rs. 26             Rs. 24
Other Expenses of
  production                    Rs. 32             Rs. 21
Profits                         Rs. 22             Rs. 15

(B. Com. B.H.U.)

The following calculations will have to be made :—


                              Commodity A            Commodity B
                               (40 units)             (20 units)
Cost of Production          Total    Per Unit      Total    Per Unit
                             Rs.       Rs.          Rs.       Rs.
Value of Raw Materials        26      0.65           24      1.20
Other Expenses of
  Production                  32      0.80           21      1.05
Profits                       22      0.55           15      0.75

Total                         80      2.00           60      3.00

[Figure: Cost, Expenses and Profits of Product A and B; rectangles with price per unit in rupees as height and quantity sold (40 units, 20 units) as breadth]
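The per-unit figures needed for the rectangles can be checked with a short calculation. This sketch assumes the height of each rectangle is the price per unit and its breadth the quantity sold, so that the area is the total value; the function name is hypothetical.

```python
# Data of Illustration 18: totals in rupees for each commodity.
products = {
    "A": {"price": 2.0, "quantity": 40,
          "parts": {"Raw materials": 26, "Other expenses": 32, "Profits": 22}},
    "B": {"price": 3.0, "quantity": 20,
          "parts": {"Raw materials": 24, "Other expenses": 21, "Profits": 15}},
}


def per_unit(parts, quantity):
    """Divide each total by the quantity sold to get the figure per unit."""
    return {name: total / quantity for name, total in parts.items()}


for name, p in products.items():
    heights = per_unit(p["parts"], p["quantity"])
    area = p["price"] * p["quantity"]  # total value represented by the rectangle
    print(name, heights, "area:", area)
```

The per-unit parts add up to the price, so they sub-divide the height of the rectangle exactly.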
Squares. When the values of items to be represented
diagrammatically have a wide range it is difficult to represent
them by means of rectangles. If the scale taken is a small one,
the heights of some of the rectangles will be disproportionately
large; if otherwise, the rectangles representing small values will
be too small. To get over this difficulty squares are drawn.


To draw a square diagram, the square root is taken of the
values of the various items to be shown in the diagram and then,
by taking a convenient scale, squares are drawn.
Illustration—19

The following figures relate to the value of sugar
manufactured in the various states of the union :—


States                  Value in Rupees
Uttar Pradesh            47,55,78,000
Bihar                    18,18,18,000
Madras                    7,57,76,000
Bombay                    4,01,01,000
Other States              1,97,17,000

Total                    79,29,85,000


Represent the above data by means of both (a) circles and
(b) squares to bring out their relative importance in the
production of sugar. (M. A. Agra).

States           Value in      Square      Side of the square or
                 crore Rs.     roots       radius of the circle
                                           in inches
U.P.                48          6.93             1.4
Bihar               18          4.24              .8
Madras               8          2.83              .6
Bombay               4          2.00              .4
Other States         2          1.41              .3


1 sq. inch = 25 crores of Rs.

[Figure: squares for U.P., Bihar, Madras, Bombay and Others]
It is necessary to calculate the scale. The area of a square
is equal to the square of its side. Thus the area of the
square for, say, Bombay is (.4)² = .16 square inch, which
represents 4 crore rupees; hence 1 square inch will represent
25 crores of rupees.
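The working for the squares can be sketched as below. The divisor of 5 is inferred from the worked table (square root 2.00 giving a side of .4 inch) and is an assumption about the scale chosen, not something stated in the text.

```python
import math

# Value of sugar manufactured, rounded to crores of rupees.
values = {"U.P.": 48, "Bihar": 18, "Madras": 8, "Bombay": 4, "Other States": 2}

DIVISOR = 5  # assumed convenient scale: side in inches = square root / 5

# Side of each square is proportional to the square root of the value,
# so that the AREA of the square is proportional to the value itself.
sides = {state: math.sqrt(v) / DIVISOR for state, v in values.items()}


def crores_per_square_inch(state):
    """Value represented by one square inch, checked on a given state."""
    return values[state] / sides[state] ** 2


for state, side in sides.items():
    print(f"{state:<14}{side:.1f} in.")
print("1 sq. inch =", round(crores_per_square_inch("Bombay")), "crores")
```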


Circles. Circular diagrams or pie diagrams are very useful 
in emphasising areas. They are comparatively easier to draw. 
With circles and sectors, total as well as component parts can 
be exhibited. Circles can be drawn by making their areas 
proportionate to the magnitude in question. In such diagrams 
however it is difficult to judge with precision the relative areas 
of circles. In the above illustration pie diagrams will be drawn 
as illustrated below. The side of square will become radius of 
the respective circles. 


[Figure: circles for U.P., Bihar, Madras, Bombay and Others]

The area of a circle is πr², where π (pi) is approximately 22/7.
The area of a circle with a radius of .4" will be π × .16 = .502656
square inch (taking π = 3.1416). Hence 1 square inch will represent
nearly 8 crores of rupees.
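The scale for the circles can be verified in the same way. This is a minimal check using Python's value of π, so the area differs from a hand calculation only in the later decimals.

```python
import math

radius_bombay = 0.4   # inches, the side of the Bombay square reused as radius
value_bombay = 4      # crores of rupees

area = math.pi * radius_bombay ** 2          # area of the circle in sq. inches
crores_per_sq_inch = value_bombay / area     # value shown by one square inch

print(round(area, 4), round(crores_per_sq_inch, 2))
```

Because the same lengths serve as sides of squares and as radii of circles, the circles are about π times larger in area, which is why the scale changes from 25 crores to nearly 8 crores per square inch.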


Angular Diagrams. Just as we divide a bar or a rectangle
to show its components, a circle can also be divided into sectors.
As there are 360 degrees at the centre, proportionate sectors
are cut taking the whole data as equal to 360 degrees.
Illustration—20

Following are the figures of the population of the various
countries of the world and the total world population in 1931 :—


Country        Population          Country        Population
              (000 omitted)                      (000 omitted)
China            411,770           Japan             64,770
India            352,370           U.K.              46,077
U.S.S.R.         161,000           France            41,860
U.S.A.           124,070           Italy             41,100
Germany           64,776           Others           705,077

                                   World          20,12,800


Country      Population    Degrees     Country     Population    Degrees
             (00,000                               (00,000
             omitted)                              omitted)
China           4118         74.0      Japan           648         12.0
India           3524         63.0      U.K.            461          8.0
U.S.S.R.        1610         29.0      France          419          7.0
U.S.A.          1241         22.0      Italy           411          7.0
Germany          648         12.0      Others         7050        126.0

                                       Total         20130        360.0

[Figure: World Population in 1931; circle divided into sectors by country]
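The degrees in the working table follow from taking the world total as 360 degrees. A minimal sketch, using the rounded population figures of the table:

```python
# Population in 1931, in units of 100,000 (as in the working table).
population = {
    "China": 4118, "India": 3524, "U.S.S.R.": 1610, "U.S.A.": 1241,
    "Germany": 648, "Japan": 648, "U.K.": 461, "France": 419,
    "Italy": 411, "Others": 7050,
}

total = sum(population.values())

# Each sector's angle is its share of the total, out of 360 degrees,
# rounded to the nearest whole degree.
degrees = {country: round(360 * p / total) for country, p in population.items()}

for country, angle in degrees.items():
    print(f"{country:<10}{angle:>5}")
print("Sum of angles:", sum(degrees.values()))
```

Rounded angles do not always add up to exactly 360; here they happen to, but in general a degree may have to be added to or taken from the largest sector.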
Illustration—21


Show the following data by angular diagram :—


                                  1950                   1960
                            Amount    Degrees      Amount    Degrees
                             Rs.                    Rs.
Sale proceeds per cycle      100       360°         150       360°
Cost per cycle :
  wages                       30       108°          90       216°
  material                    40       144°          40        96°12′
  other costs                 20        72°          30        72°
Total cost                    90       324°         160       384°12′
Profit (+) or loss (—)
  per cycle                 + 10     + 36°         — 10     — 24°12′

[Figure: angular diagrams for 1950 and 1960]


Illustration—22


Given below are the data relating to the expenditure of
three families :—


Items of Expenditure          Family A     Family B     Family C
                               Rs. nP.      Rs. nP.      Rs. nP.
Food                            12.00        30.00        90.00
Clothing                         2.00         7.00        35.00
Rent                             2.00         8.00        40.00
Education                        1.50         3.00        12.00
Litigation                       1.00         5.00        40.00
Conventional Necessaries         0.50         3.00        60.00
Miscellaneous                    1.00         4.00        23.00


Represent the above data by an Angular Diagram.

(M. Com. Agra).
                         Family A           Family B           Family C
                       Rs.    Degrees     Rs.    Degrees     Rs.    Degrees
Food                  12.00    216°      30.00    180°      90.00    108°
Clothing               2.00     36°       7.00     42°      35.00     42°
Rent                   2.00     36°       8.00     48°      40.00     48°
Education              1.50     27°       3.00     18°      12.00     14°
Litigation             1.00     18°       5.00     30°      40.00     48°
Conv. Necessaries      0.50      9°       3.00     18°      60.00     72°
Miscellaneous          1.00     18°       4.00     24°      23.00     28°

Total                 20.00    360°      60.00    360°     300.00    360°
Square Root            4.47               7.74              17.32
Radii of Circles       0.22"              0.39"              0.87"

[Figure: Diagram showing monthly expenditure of three families; circles for Family A, Family B and Family C, with index: Food, Clothing, Rent, Education, Litigation, Con. Nec., Misc.]


3—Three-Dimensional Diagrams. Three dimensional
diagrams or volume diagrams comprise cubes, spheres,
prisms, cylinders and blocks. Of these, cubes are the most common.
The side of a cube is drawn in proportion to the cube root of the
magnitude of the data. Such diagrams are useful when the
data have a very wide range. A cube is drawn as shown below :—


Illustration—23


Show the following data by a three dimensional diagram :—


Area under Cocoanuts in 1959-60


State          Acres      Reduced figures     Cube       Side of
                            (Acres ÷ 29)      roots      a Cube
                1                2              3           4
Madras        596147          20557           27.40       2.74
Bombay         28547            984            9.95       1.00
Orissa         28425            980            9.93       0.99
W. Bengal      12700            438            7.59       0.76

[Figure: Area under Cocoanuts in 1959-60; cubes for Madras, Orissa, Bombay and W. Bengal]
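The columns of the working table can be reproduced as follows. The reduction by 29 and the division of the cube root by 10 are read off the worked figures and are assumptions about the scale used; since this sketch does not round at the intermediate steps, the last decimal can differ from the printed table (for Bombay it gives 0.99 rather than 1.00).

```python
# Area under cocoanuts in 1959-60, in acres.
acres = {"Madras": 596147, "Bombay": 28547, "Orissa": 28425, "W. Bengal": 12700}

REDUCTION = 29     # assumed: acres are divided by 29 to get manageable figures
SIDE_DIVISOR = 10  # assumed: side of the cube = cube root / 10


def cube_side(area):
    """Side of the cube, proportional to the cube root of the reduced figure."""
    reduced = area / REDUCTION
    root = reduced ** (1 / 3)
    return root / SIDE_DIVISOR


sides = {state: round(cube_side(a), 2) for state, a in acres.items()}
for state, side in sides.items():
    print(f"{state:<10}{side:.2f}")
```

Although Madras has nearly 47 times the acreage of W. Bengal, its cube is under four times as long in the side, which is how a very wide range is brought onto one page.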


4—Pictograms. The technique of pictograms was developed
by Dr. Otto Neurath, who was a resident of Vienna. This
is the reason for calling this method the Vienna method.
This method is also known as the Isotype method.


Pictograms are very useful in attracting attention as well
as for showing comparisons. They are easily understood and
create a lasting impression on the mind.

5—Cartogram or Mapograph. Statistical maps are also
used to represent data like density of population etc. Mapographs
are superior to Pictograms because of the spatial element. On
a map data may be shown either (a) by points, dots or crosses,
or (b) by writing the actual figure, or (c) by deepening the
colour in proportion to the magnitude.
Suggestions to be followed in the use of statistical
diagrams :

Secrist* makes the following suggestions :—

1. Choose illustrations which are least liable to be
misunderstood, and which most faithfully and correctly interpret
the facts.

2. See that fact and representation agree and that all
diagrams are provided with concise, clearly stated and
appropriate titles.


3. Avoid figures which must be read in more than one 
dimension. 


4. Indicate on diagrams the scales of values used, and 
where necessary to avoid confusion, the dimension or dimensions 
which are significant in interpretation. 


5. Include as a component or as an accompanying part of 
diagrams the concrete data which they illustrate. 


6. In expressing the different parts of a total, use lines or 
bars rather than sectors of circles. 


7. In statistical maps representing a series, divide the
frequencies and not the number of districts or divisions into
equal parts.


8. In statistical maps representing a series, incorporate as 
a part of the legend the frequency with which the units of 
measurement occur, thus indicating the distribution by map and 
by legend. 


To give a decent shape bars should be arranged in 
numerical, alphabetical, progressive or historical order. 


Theoretical Questions 


1. What are the different types of diagrams which are
used in Statistics to show the salient characteristics of groups and
series ? Illustrate your answer. (B. Com. Saugar)


2. Point out the usefulness of diagrammatic representation 
of facts and explain the construction of any one of the different 
forms of diagrams you know. (B. Com. Alld.) 


* Secrist—Introduction to Statistical Methods.


3. Describe the importance and drawbacks of diagrams in
summarising statistical raw data. (B. Com. Bombay)


4. Write a short note on the utility of Diagrams in business 
and commerce. (B. Com. Gujrat) 


Practical Questions 


1. Explain the importance of diagrammatic representation 
of statistical data, and represent the following figures by an 
appropriate diagram :— 


Approximate monthly income in rupees of a few classes of workers
in U.P.

                      Rs.                               Rs.
Artisan                …       …                     2,000.0
Clerk                 20.0     Labourer                 10.0
Greengrocer           40.0     Peon                     12.5
Gumasta               30.0     Pleader                 150.0
Cultivator             5.0     School Teacher           30.0
Doctor               950.0     University Teacher      300.0


(B. Com., B.H.U.)


2. The following figures show the extent of the regrouped
railways after 14th April, 1960 :—


Railway                  Route Mileage
(1) Southern                 6,017
(2) Central                    …
(3) Western                  5,631
(4) Northern                 6,007
(5) North Eastern            4,787
(6) Eastern                  5,667


Represent the above data by a Bar diagram.


3. Utilize the following data to present diagrammatically 
the relative increase in note-circulation towards the end of 1945 
in the different countries :— 


NOTES IN CIRCULATION

(In millions of National Currency Unit)


Country          1939        End of 1945
Canada            288           1,129
U.S.A.          7,598          28,507
U.K.              555           1,380
Australia          57             200
India           2,245          12,109


(M. A. Alld.)
4. Show by suitable diagrams the absolute as well as relative
changes in the student population of the Colleges A and B in the
different departments from 1940 to 1947 :—


Department          College A             College B
                  1940      1947        1940      1947
Arts               300       350         100       200
Science            120       500         150       250
Commerce           200       650         130       150
Law                100       300         100       120

(B. Com. Agra)


5. Show by diagrams the distribution of Reserve Bank
shares as given below :—


               No. of shares in        No. of shareholders
                  thousands               in thousands
               1937       1938          1937       1938
Bombay          201        206           215        208
Calcutta        125        123           145        131
Delhi            94         93           157        149
Madras           60         59            91         87
Rangoon          19         18            18         16


(B. Com. B. H. U.)


6. Show by means of circular diagram the following :—


Centre's Clearing House Returns
(Amount in crores of rupees)


                     1940        1945
Calcutta             1070        2670
Bombay                829        2443
Madras                108         274
Other Centres         313         515

(B. Com. Raj.)


7. The following table gives the details of the cost of
construction of a house in Allahabad :—


               Rs.                          Rs.
Land          4,500      Cement             800
Labour        2,500      Lime               800
Bricks        2,000      Stone              600
Iron          1,800      Sand               200
Timber        1,500      Other things     1,300


Present the above figures by a suitable diagram.

(B. Com. Alld.)
8. Show the details of monthly expenditure of two families
given below by means of two-dimensional diagrams :—


Items of                  Family A                  Family B
Expenditure          Income Rs. 500 p.m.       Income Rs. 400 p.m.
                            Rs.                       Rs.
Food                        140                       120
Clothing                     80                        80
House Rent                  100                        60
Education                    30                        40
Fuel & Lighting              40                        20
Miscellaneous                40                        40

(M. A. Punjab)


9. The following table gives the details of monthly
expenditure of three families :—


Items of                Family X     Family Y     Family Z
Expenditure               Rs.          Rs.          Rs.
Food                       24           60          180
Clothing                    4           14           70
House Rent                  4           16           80
Education                   3            6           24
Litigation                  2           10           80
Conventional needs          1            6          120
Miscellaneous               2            8           46


Represent the above figures by a suitable diagram. Which
family is spending most wisely ?

(M. Com. Alld.)

10. The following table gives certain data in respect of
coal production for two years :—


                                     1940      1950
                                     Rs.       Rs.
Proceeds per ton disposable
  commercially                        24        40
Cost per ton—
  Wages                               16        26
  Other costs                          9        10
  Royalties                            1         1
Profit (+) or Loss (—)                —2        +3


Draw a suitable diagram to represent these statistical facts.
(B. Com. B.H.U.)
11. Represent the following by Sub-divided Bars drawn on
the percentage basis :—


Cost, Proceeds and Profit or Loss per chair during 1938, 1939 and
1940

Cost per chair—                      1938      1939      1940
(a) Wages                             4.5       7.5      10.5
(b) Other costs                       3.0       5.1       7.0
(c) Polishing                         1.5       2.4       3.5
Total Cost                            9.0      15.0      21.0
Proceeds per chair                   10.0      15.0      20.0
Profit (+)/Loss (—) per chair       (+)1.0      —       (—)1.0
12. Draw suitable diagrams to represent the following :—

Factory     Wages     Materials     Other costs     Profit
             Rs.         Rs.            Rs.           Rs.
A             …         5,000          1,000         1,000
B           9,000       3,000            800           500


The number of units produced by A and B were 1,000 and
700 respectively.
Show also cost and profit per unit. (B. Com. B. H. U.)


13. The production targets envisaged under the First Five
Year Plan and the estimates of production in 1955-56 are given
below in respect of a few industrial commodities.


Commodity                     Target under       Estimated Production
                              First Plan for         for 1955-56
                                1955-56              '000 tons
                               '000 tons
(a) Pig Iron (Capacity)           850                   Nil
(b) Finished Steel
    (Capacity)                    100                    35
(c) Ammonium Sulphate             315                   826
(d) Cement
    (U.P. Govt. Factory)          200                   180


Represent the above by a suitable diagram.
(Dip. in Bus. Man. Delhi)


14. Illustrate the following data by a suitable diagram:

                                        I        II
                                        Rs.      Rs.
Price per unit of a commodity           10       12
Quantity sold                           20       24
Value of raw materials used            100      120
Other expenses of production            60       96
Profit                                  40       72

(M. A. Aligarh)
15. The following table gives the birth rates and death rates
of a few countries in the world during the year 1931:

Country              Birth rate     Death rate
Egypt                44             27
Canada               24             11
U.S.A.               19             12
India                33             24
Japan                32             19
Germany              16             11
France               18             16
Irish Free State     20             14
United Kingdom       [illegible]    12
Soviet Russia        40             18
Australia            20              9
New Zealand          18              8
Palestine            58             [illegible]
Sweden               15             12
Norway               17             11

Represent the above figures by a suitable diagram.
(B. Com. Lucknow)
16. Indicate the diagrams you would consider most 
appropriate to use for representing each of the following classes 
of statistical data, stating briefly the reason for your choice :— 


(a) Distribution of a large number of candidates according 
to the number of marks scored by each at а public examination. 

(b) Marks scored by two selected candidates in each of six 
different subjects tested at an examination. 

(c) Total value of Indian Exports and Imports during the
years 1938 to 1955.

(d) Distribution of Assets of all Indian Life Assurance
Companies put together as at January 19, 1956.

(e) Middle class cost of Living Index Numbers in Bombay 
and Calcutta during the years 1938 to 1955. 

(f) Distribution of age, sex and civil condition of persons 
enumerated at the census in 1951. (I. A. S.) 


17. The following is the table of crime figures (excluding
petty cases) reported, with the number of detections, by the Police
Deptt., Govt. of Bombay, for June 1952.

Type of Crime                  Number       Number
                               Reported     Detected
Murder                          25           29
Dacoity and Robbery             84           19
House Breaking (by day)        114           85
House Breaking (by night)      137           41
Hurt and stabbings             162           71

Total                          472          234

Prepare a diagram to exhibit the incidence of each type of
crime and to compare the same with the measure of efficiency of
the police department as depicted by the relative figures of
detection.

(B. Com. Bombay)
18. Draw suitable bar diagrams to represent the following
data.

Exports & Imports: India
(Crores of Rupees)

Year     Imports     Exports     Difference
1952     671         578          93
1953     576         531          45
1954     656.5       594          62.5
1955     706.5       609.5        97.0
1956     832.5       612.5       220.0

(B. Com. Bombay)


19. The distribution of outlay between different fields of
education in the First and Second Five Year Plans is set out below.

                                         First Plan    Second Plan
                                             (Rs. in crores)
Elementary Education                         93             89
University Education                         15             57
Secondary Education                          22             51
Technical and Vocational Education           23             48
Social Education                              5              5
Administration and Miscellaneous             11             57

Total                                       169            307

Draw a suitable diagram to compare the total outlay
as well as the outlay under individual heads. (Gujrat B. Com.)


20. Represent the following data by means of an angular
diagram:

Number of Persons employed in factories

Year     Men         Women       Children    Total
1950     180,000     110,000      70,000     360,000
1960     350,000     210,000     160,000     720,000

(Gujrat B. Com.)


CHAPTER 7


GRAPHS AND ECONOMIC CURVES 


“Illustrations including charts and graphs tend to simplify 
comparisons of statistical matter and trends." 


Dickson Hartwell


A graph is a simple and effective way of illustrating and
comprehending a table. It gives pictorial effect to what would
otherwise be just a mass of figures. Drawing graphs is an art
which can be acquired only through practice. Statisticians have
long recognised the importance of graphic portrayal of their
research findings, and they employ this method of visual analysis
to advantage. Graphic methods enable them to present quantitative
data in a simple, clear and effective manner.


Statistical data are often so complex that it is very difficult for
a common man to understand them. Though much of their
complexity is reduced by classification and tabulation, they are
still not easy to understand. Figures by their nature are
uninteresting. If a huge mass of data is depicted in the language
of lines and curves, it becomes easy to understand and grasp,
and its chief features at once strike the mind. According to
Prof. Boddington, “The wandering of a line is more powerful
in its effect on the mind than a tabulated statement; it shows
what is happening and what is likely to take place, just as
quickly as the eye is capable of working.” The functions of
graphs are: (1) to render complex data simple and easily
understandable by giving a pictorial view; (2) to make
comparisons easy by bringing the connected data near each other
and placing their graphic representations side by side.


Advantages of Graphic Representation. Graphic representation
of statistical data has the following advantages:

(1) Graphic representation is attractive, interesting and
    impressive. A graph is more attractive than a table of
    figures. One is likely to get lost in a dry set of figures.
    By looking at a graph, one can very easily study the
    tendency and fluctuations of the data.
(2) Data are visible at a glance. Figures, even when they
    are presented in a table, have to be studied thoroughly
    before any conclusion can be drawn.
(3) Graphic presentation of statistical data saves time and
    energy. This is probably the simplest method of
    presenting statistical data to such advantage.
(4) Comparisons can be made between two or more
    phenomena very easily.
(5) Graphic presentation of statistical data is also helpful
    in interpolation, extrapolation and forecasting.

Due to these advantages, graphic presentation of statistical
data is becoming more and more popular with statisticians.
Mr. Hubbard writes that “we may portray by simple graphic
methods whole masses of intricate routine, the organization of an
enterprise or the plan of a campaign. Graphs serve as storm
signals for the Manager, Statesman, Engineer; as potent
narratives for the Actuary, Statist, Naturalist; and as forceful
engines of research for Sciences, Technology and Industry.

Graphs are dynamic, dramatic. They may epitomise an
epoch, each dot a fact, each slope an event, each curve a history.
Wherever there are data to record, inferences to draw or facts
to tell, graphs furnish the unrivalled means whose power we are
just beginning to realise and apply.”


Defects of Graphic Presentation 


1. Many people are not accustomed to it and do not
attach much importance to it.

2. A curve shows only the tendency and fluctuations; the actual
values cannot be read from it exactly.

3. Graphic presentation may often give misleading
impressions. Much depends upon the scale taken. Two different
scales may show different fluctuations in the same data.

4. Complete accuracy is not possible in a graph.

5. Graphs cannot be quoted in support of a statement.

Structural Framework of Graphs. If two straight lines
which intersect at right angles are drawn in a plane, it is
possible to describe the location of any point in that plane with
reference to the point of intersection of the two lines. It is
customary to designate the vertical line as the y-axis, shown as
yy' in the figure. The horizontal line xx' is referred to as the
x-axis. The point at which the lines intersect is called the 'origin'
and is assigned a value of zero. From this point a scale may be
laid out on the x and y axes. x values to the right of the origin
are given a positive sign, those to the left a negative sign. y values
above the origin receive a positive sign, those below a negative sign.

The two axes divide the plane into four sections known as
"QUADRANTS". Both the x and y values are positive in
Quadrant I. x values are negative and y values are positive in
Quadrant II. In Quadrant III both x and y are negative. In
Quadrant IV, x values are positive and y values are negative.


A value plotted horizontally from zero on the x-axis is
known as the 'abscissa' of the point, while a value plotted vertically
from zero on the y-axis is called the 'ordinate'. If we wish to
plot a point with abscissa +20 and ordinate +20, it will be located
at point P1. If the abscissa is -20 and the ordinate +15,
point P2 is located. Abscissas which are negative and paired
with negative y values will be plotted in Quadrant III. Positive
abscissas and negative ordinates will fall in Quadrant IV.
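
The sign rules for the four quadrants can be sketched in a few lines of Python. This is only an illustration; the function name `quadrant` and the convention of returning 0 for a point lying on an axis are assumptions made here, not part of the text.

```python
def quadrant(x, y):
    """Return the quadrant (1 to 4) of the point (x, y).

    Returns 0 for a point on either axis (an assumed convention),
    since such a point belongs to no quadrant.
    """
    if x == 0 or y == 0:
        return 0
    if x > 0:
        return 1 if y > 0 else 4
    return 2 if y > 0 else 3


# The two points named in the text, plus one in each remaining quadrant.
for point in [(20, 20), (-20, 15), (-5, -5), (5, -5)]:
    print(point, "lies in Quadrant", quadrant(*point))
```

The point with abscissa +20 and ordinate +20 falls in Quadrant I, and the point with abscissa -20 and ordinate +15 in Quadrant II.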


[Figure: the x and y axes dividing the plane into Quadrants I to IV, with a point (x, y) plotted]

Most of the data of economics and business are plotted in
Quadrant I.


In drawing graphs, the problem arises of the choice of a
suitable scale. The choice should be so made as to accommodate
the whole of the data. Though no hard and fast rules can be laid
down for the choice of scale, the following words of Prof. A. L.
Bowley will serve as a guide: “It is difficult to lay down
rules for the proper choice of scales by which the figures should
be plotted out. It is only the ratio between the horizontal and
vertical scales that need be considered. The figure must be
sufficiently small for the whole of it to be visible at once; if the
figure is complicated, related to long series of years and varying
numbers, minute accuracy must be sacrificed to this consideration.
Supposing the horizontal scale is decided, the vertical scale must
be chosen so that the part of the line which shows the greatest
rate of increase is well inclined to the vertical, which can be
managed by making the scale sufficiently small; and on the
other hand, all important fluctuations must be clearly visible, for
which the scale may need to be increased. Any scale which
satisfies both these conditions will fulfil its purpose.” Having
decided on the scale, the data may be plotted on the graph paper.


Graphs of Time or Historical Series. A time series is a
sequence of values corresponding to successive points or periods
of time. When data such as sales, production, employment etc.
are arranged chronologically they constitute a time series. Time
series are graphed with time on the x-axis and the variable
under consideration on the y-axis. The following points should
be noted while plotting:

1. Do not indicate plotted points with circles or crosses. Use
dots which disappear into the lines.

2. Join points with straight lines, not curves.

3. The time scale should be fixed carefully.

4. The graph must be given a title.


Illustration—1 


The following table shows the sales for one year of a
trading concern. Present the facts on graph paper.

Month         Sales (in '000 Rs.)
January        22
February       24
March          27
April          29
May            30
June           31
July           34
August         36
September      37
October        45
November       48
December       49

[Graph: Sales in Rupees ('000) plotted month by month, Jan to Dec]

Silhouette chart. In order to emphasise the movements of
the curve and to display the data in a striking manner, charts
are shaded as shown below. This makes the graph more
attractive.
False Base Line. One very important rule in drawing
graphs is that the vertical axis must start from zero as origin.
But in certain cases this rule has to be violated. Where the
fluctuations are small relative to the size of the variable but
have to be shown clearly, and where adjustment of scales is not
otherwise possible, instead of showing the entire vertical scale
from zero to the highest value involved, only as much of it is
shown as is just sufficient for the purpose. That portion of the
scale which lies between zero and the smallest value of the
variable is omitted. By the use of a false base line,
minor fluctuations are magnified and become clearly visible
on the graph. If the items are large and the vertical scale
begins from zero, the curve lies mostly near the top of the
paper, and if the differences in the values of the various items
are not great, it takes more or less the shape of a straight line.
Whenever a false base line is used it should be clearly indicated
on the graph.


False base line should be used only when it is absolutely 
necessary to do so. It is used to save space and to depict small 
fluctuations. Extra care is needed in the interpretation of such 
graphs, because they may give misleading impressions. 


Illustration—2 


Following are given index numbers of national income with
the year 1955 as the base. Plot them on graph paper.

1955     100
1956     105
1957     103
1958     [illegible]
1959     112
1960     [illegible]
Graph of Two or More Variables. For comparison, two or
more variables may be shown on the same graph. This is possible
only when the units of the series are the same. When there is
more than one curve, the curves should be drawn in different
colours or in different types of lines. Generally the following
types of lines may be used:

Illustration—3

Represent graphically the following data showing the Index
Numbers of Money Supply, Industrial Production and Wholesale
Prices in India during 1960:

                      Index Numbers of
Months        Money          Industrial     Wholesale
              Supply         Production     Prices
January       [illegible]    116             96
February      123            127             96
March         128            122             98
April         130            128             99
May           130            148             99
June          128            143            101
July          126            154            104
August        125            126            101
September     123            139            101
October       124            125            101
November      126            135            103
December      127            129            102

When figures of Imports and Exports, Income and
Expenditure, or Revenue and Expenditure are to be shown
graphically, it is essential to show the curve representing the
balances as well.
Illustration—4

Show graphically the following data:

Year     Income     Expdr.     Balance
1948     16         28         -12
1949     24         43         -19
1950     32         48         -16
1951     35         47         -12
1952     33         33           0
1953     18         14         + 4
1954     35         30         + 5
1955     33         30         + 3
1956     39         33         + 6
1957     49         36         +13
1958     39         35         + 4

[Graph: Income, Expenditure and Balance by year, 1948 to 1958]

This graph can also be shown in this way:


If two variables are expressed in two different units then 
we will have two scales one on the left and the other on the 
right. To facilitate comparison each scale is made proportional 
to the respective averages of each. 


Illustration—5

Represent the following data graphically:

Export of Raw Hemp from India

Year     Quantity        Value
         ('000 cwt.)     (Lakhs of rupees)
1954     665             339
1955     342             175
1956     271             128
1957     417             248
1958     342             146
1959     364             118
1960     426             184

Average of the quantity series is approx. 400.
Average of the value series is approx. 191.
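
The rule of making each scale proportional to the average of its own series can be worked out numerically. The sketch below uses the figures of this illustration; the choice of tick marks on the left scale and the variable names are assumptions made for the example only.

```python
# Export of raw hemp, 1954 to 1960, as given in the table above.
quantity = [665, 342, 271, 417, 342, 364, 426]   # '000 cwt.
value = [339, 175, 128, 248, 146, 118, 184]      # lakhs of rupees

mean_quantity = sum(quantity) / len(quantity)
mean_value = sum(value) / len(value)

# One unit on the quantity (left) scale corresponds to this many
# units on the value (right) scale, so that the two averages fall
# at the same height on the graph.
ratio = mean_value / mean_quantity

# Assumed tick marks for the left scale; the right scale follows.
left_ticks = [100, 200, 300, 400, 500, 600, 700]
right_ticks = [round(tick * ratio) for tick in left_ticks]

print("Average quantity:", round(mean_quantity, 1))
print("Average value:", round(mean_value, 1))
for left, right in zip(left_ticks, right_ticks):
    print(left, "on the left scale is level with", right, "on the right")
```

In practice the averages are rounded first, as the text does, so that both scales carry convenient round figures.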


[Graph showing export of raw hemp, 1954 to 1960: quantity on the left-hand scale, value in lakhs of rupees on the right-hand scale, years on the horizontal axis]


Certain data are to be shown on the basis of percentages. This
should be done as illustrated below:

Illustration—6

Represent the following data by a suitable graph:

Distribution of Rural and Urban Population of India
(Percentage of Total Population)

Year            1941     1931     1921     1911     1901     1891
Rural Areas     87       89       89.8     90.6     90.2     90.5
Urban Areas     13       11       10.2      9.4      9.8      9.5

(B. A. Alld.)
Illustration—7

The following table shows the highest and the lowest values
of silver per 50 grams. Represent the data graphically:

Month (1960)     Highest Price     Lowest Price
January          80.5              80.0
February         80.0              79.0
March            80.9              80.0
April            80.6              80.0
May              79.4              79.0
June             79.1              78.6
July             79.4              78.1
August           79.6              79.0
September        80.0              79.2
October          80.9              79.1
November         80.6              79.5
December         80.0              79.6

[Graph: price curves of the highest and lowest prices of silver, by month]

An alternative method of showing such data is by means of
small lines or bars as shown below:
Band Graph. It is a constituent-element chart. It shows how
and in what proportions the individual items comprising an
aggregate are apportioned. The different component parts of the
whole are plotted one over the other. They are distinguished by
different shades.


Illustration—8

Show the following data by a suitable graph:

Production in million tons

Year     Rice           Wheat    Pulses         Other cereals    Total
1954     28             8        10             22               68
1955     25             9        10             22               66
1956     26             8        [illegible]    19               [illegible]
1957     28             9        [illegible]    20               [illegible]
1958     [illegible]    8        [illegible]    20               [illegible]
1959     [illegible]    10       [illegible]    21               [illegible]


Band graph is also known as Belt curve. 
The Zee chart or Z curve. This type of graph is popular
in business rather than in statistical circles. It derives its
name from the form made by the lines on the graph. Three curves
are shown in such a chart:


(1) Curve of original data 
(2) Curve of cumulative data and 
(3) Moving total curve. 
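
The three series needed for a Z chart can be derived from two years of monthly figures. The following is a sketch only: the function name `z_chart_series` is invented for illustration, and the monthly figures are hypothetical. The moving total is taken here as the total of the latest twelve months, which is an assumption about the usual business practice.

```python
def z_chart_series(previous_year, current_year):
    """Return (original, cumulative, moving_total) for a Z chart.

    Each argument is a list of 12 monthly figures. The moving total
    for a month is the total of the 12 months ending with that month.
    """
    if len(previous_year) != 12 or len(current_year) != 12:
        raise ValueError("expected 12 monthly figures for each year")
    cumulative = []
    moving_total = []
    running = 0
    for month, figure in enumerate(current_year):
        running += figure
        cumulative.append(running)
        # This year so far, plus the months of last year not yet replaced.
        moving_total.append(running + sum(previous_year[month + 1:]))
    return list(current_year), cumulative, moving_total


# Hypothetical figures, for illustration only.
last_year = [10] * 12
this_year = [12] * 12
original, cumulative, moving = z_chart_series(last_year, this_year)
print(cumulative)
print(moving)
```

By December the cumulative curve and the moving total curve meet, which closes the letter Z.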


Illustration—9 


Show the following data by Z curve. 
ABC Company Ltd. Sales Record, 1960

Month         Monthly      Cumulative        Moving
              Sales        Monthly Total     Annual Total
January        9,378         9,378           138,680
February       7,624        17,002           138,827
March          9,810        26,812           138,965
April         12,351        39,163           139,633
May           14,394        53,557           140,172
June          17,839        71,396           142,619
July          15,674        87,070           142,206
August        15,301       102,371           141,977
September     12,219       114,590           143,869
October       10,046       124,636           144,705
November       8,917       133,553           144,147
December      11,463       145,016           145,016

[Z chart: Sales Record of ABC Company Ltd., 1960, by month, Jan to Dec]


Graphs of Frequency Distributions 


A frequency distribution can be represented graphically.
Such graphs give a better picture than when the data are
arranged in a table.
There are three types of graphs for presenting such a series:
(1) The Histogram
(2) The Frequency Polygon
(3) The Frequency Curve.

The Histogram. The data are plotted as a series of 
rectangles. Class intervals are shown on x—axis and the 
frequencies on the y axis. There are as many columns as there 
are classes. The area of each rectangle represents the 
frequency of the class. Each rectangle is joined with the other 
so as to give a continuous picture. The sum of the frequencies 
is represented by the total area of the histogram. 
Illustration—10 

Represent the following frequency distribution in a
Histogram:

Monthly Income     Number of
(in rupees)        Families
  0-50              93
 50-100            205
100-150            157
150-200            109
200-250             64
250-300             41
300-350             22
350-400              9

(Certificate St. B.H.U.)

[Histogram: number of families (vertical axis) against monthly income in Rs., 0 to 400 (horizontal axis)]


Histograms of Unequal Class-Intervals. If a frequency
table consists of unequal intervals, it becomes a bit difficult
to prepare a histogram. The rectangles will be of unequal
width, and their heights must be adjusted proportionately, so
that the area of each rectangle still represents the frequency of
its class. Thus, if a class is twice as wide as the standard
interval, its frequency is divided by two to give the height of
the rectangle.
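
The adjustment of heights can be sketched as follows, using the salary classes of Illustration 11 as printed. The standard width of Rs. 10 and the function name `adjusted_heights` are assumptions made for this example.

```python
# (lower limit, upper limit, number of clerks), from Illustration 11.
classes = [
    (40, 50, 56), (50, 60, 87), (60, 70, 121), (70, 80, 154),
    (80, 90, 138), (90, 100, 95), (100, 120, 112), (120, 150, 72),
    (150, 200, 30),
]
standard_width = 10  # assumed: the width of the narrowest classes


def adjusted_heights(classes, standard_width):
    """Height of each rectangle = frequency / (width in standard units)."""
    heights = []
    for lower, upper, frequency in classes:
        width_in_units = (upper - lower) / standard_width
        heights.append(frequency / width_in_units)
    return heights


heights = adjusted_heights(classes, standard_width)
for (lower, upper, frequency), height in zip(classes, heights):
    print(f"{lower}-{upper}: frequency {frequency}, height {height:g}")
```

The class 100-120, being twice the standard width, is drawn to a height of 56 instead of 112, so that its area still stands for 112 clerks.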

Illustration—11 


The following table gives salaries of 840 clerks employed 
in a big establishment :— 


Monthly Salary Number of 
(in rupees) Clerks 
40— 50 56 
50— 60 87 
60— 70 121 
70— 80 154 
80— 90 138 
90—100 95 
100—120 112 
120—150 72 
150—200 30 


Represent the data in a Histogram. 
(M. Com. B.H.U.) 


[Histogram showing the salaries of 840 clerks: number of clerks (vertical axis) against monthly salaries in rupees (horizontal axis)]

Illustration—12


120 individuals firing at a moving target miss by the 
following distances, the positive (+) and negative(—) signs 
corresponding to the shot being in advance or behind the target. 
Draw a Histogram :— 


1 shot is between +10 and +15 inches wide 
3 shots are between + 5 and +10 inches wide 
20 shots are between 0 and + 5 inches wide 
25 shots are between — 5 and 0 inches wide 
12 shots are between —10 and — 5 inches wide 
17 shots are between —15 and —10 inches wide 
13 shots are between —20 and —15 inches wide 
10 shots are between —25 and —20 inches wide 
7 shots are between —30 and —25 inches wide 
2 shots are between —35 and —30 inches wide 


[Histogram: number of shots (vertical axis) against distance from the target in inches, -35 to +15 (horizontal axis)]


Frequency Polygon. The histogram can also be shown as
a frequency polygon. To make a frequency polygon we
connect the mid-points of the tops of the rectangles by straight
lines. This is done under the assumption that the frequencies
in a class interval are evenly distributed throughout the class.
When the class intervals are equal and the polygon is brought
down to the base line at both ends, the area of the polygon is
equal to the area of the histogram, because the area left outside
is just equal to the area included in it. The value of the variable
at the apex of the polygon gives the mode.


Illustration—13 


Prepare a frequency polygon from the following data :— 


Wages in Rs.     No. of Persons
1-2               8
2-3              10
3-4              [illegible]
4-5              24
5-6              17
6-7              14
7-8               8

[Frequency polygon: frequency (vertical axis) against wages in rupees, 0 to 9 (horizontal axis)]


Frequency Curve. А smoothed curve is drawn, generally 
freehand, through the various points of the polygon in such a 
way that the area included under it is just the same as that of 
the polygon. А smoothed frequency curve gives a regular and 
continuous curve without distorting the facts. If the curve is 
correctly drawn, it can be used for interpolation also. 


Illustration—14 
Draw a frequency curve of the data given in the above
illustration.

[Smoothed frequency curve: frequency (vertical axis) against wages in rupees (horizontal axis)]


Frequency graph of Discrete series. In a discrete series
the data lack continuity. The frequencies are therefore
represented by bars or lines. The height of the bar or line
represents the frequency.


Illustration—15 


Show graphically :— 


Rooms     No. of houses
1          8
2         10
3         15
4         20
5         15
6         10
7          8

The above data can be shown graphically in the following
ways:

[Graphs: number of houses shown by bars and by lines against number of rooms]

Graphs on Ratio or Logarithmic Scale


The graphs so far discussed have been drawn on the natural or
arithmetic scale. In such graphs equal distances on the y-axis
represent equal absolute magnitudes, hence only absolute changes
in the values of a variable are shown. The ratio scale is used to
study relative changes in the values of a variable. It tells about
the rate or ratio of change.


The relative changes can be studied graphically in the 
following two ways :— 


(1) By plotting the logarithms of the given values on
a natural scale.

(2) By plotting the given values on a logarithmic scale.

In the second method only the vertical scale is in logarithmic
ratio. The horizontal scale is in arithmetic ratio. That is why
it is called a semi-logarithmic graph. In the words of James
A. Field, “It is far superior to the natural scale for effecting
comparison when very small and very large quantities must be
taken into account concurrently.”


The semi-logarithmic and the arithmetic scales are so
dissimilar as to give quite different graphic descriptions of the
same data. This is clear from the following example:

Investment of two sums, Rs. 100 and Rs. 1000, at 10% compound
interest.

Year     A       B        Log A     Log B
1        100     1000     2.00      3.00
2        110     1100     2.04      3.04
3        121     1210     2.08      3.08
4        133     1330     2.12      3.12
5        146     1460     2.16      3.16
6        161     1610     2.20      3.20
7        177     1770     2.24      3.24
8        195     1950     2.28      3.28

When these figures are plotted on the natural scale, it appears
that the sum of Rs. 1000 is increasing at a higher rate than the
sum of Rs. 100, though the rate of change is the same.
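
The point of this example can be checked by calculation. The sketch below recomputes the two series; the function name `compound_series` is an assumption made for illustration. The absolute yearly increases differ tenfold between the two sums, while the yearly increases in the logarithms are identical.

```python
import math


def compound_series(principal, rate, years):
    """Amounts for years 1 to `years`, growing at a fixed compound rate."""
    return [principal * (1 + rate) ** n for n in range(years)]


a = compound_series(100, 0.10, 8)
b = compound_series(1000, 0.10, 8)

# Absolute yearly increases: what the natural scale shows.
increase_a = [a[i + 1] - a[i] for i in range(len(a) - 1)]
increase_b = [b[i + 1] - b[i] for i in range(len(b) - 1)]

# Yearly increases in the logarithms: what the ratio scale shows.
log_step_a = [math.log10(a[i + 1]) - math.log10(a[i]) for i in range(len(a) - 1)]
log_step_b = [math.log10(b[i + 1]) - math.log10(b[i]) for i in range(len(b) - 1)]

print("First-year increase: A =", round(increase_a[0]), " B =", round(increase_b[0]))
print("Log step for A:", round(log_step_a[0], 4))
print("Log step for B:", round(log_step_b[0], 4))
```

Every step in either logarithm column equals log 1.1, which is why the two curves run parallel on the ratio scale.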


[Graph: Sums of Rs. 100 and Rs. 1000 rising at compound interest rate of 10% (semi-log paper); Rupees against Years]

[Graph: Sums of Rs. 100 and Rs. 1000 rising at compound interest rate of 10% (logs on the natural scale)]

[Graph: Sums of Rs. 100 and Rs. 1000 rising at compound interest rate of 10% on natural scale; Amount (Rs.) against Years]


Illustration —16 


Plot the following figures relating to population of India 
so as to show the proportionate increase in population from one 
period to another. 


Year     Population
         (000,000 omitted)
1871     210
1881     250
1891     290
1901     295
1911     315
1921     320
1931     350
1941     390

(B. Com. Nagpur)

Population     Log        Population     Log
210            2.3222     315            2.4983
250            2.3979     320            2.5051
290            2.4624     350            2.5441
295            2.4698     390            2.5911

[Graph: logs of population (vertical axis, 2.2 to 2.6) against years, 1871 to 1941]


The following rules must be kept in mind while interpreting a
semi-logarithmic graph:

1. (a) A series increasing at a constant proportional rate,
       i.e., a geometric progression, will plot as a straight
       line with an upward slope.
   (b) A series increasing at a decreasing rate will plot as
       a curve concave to the base.
   (c) A series increasing at an increasing rate will plot
       as a curve convex to the base.

2. (a) A series decreasing at a constant proportional rate
       will plot as a straight line with a downward slope.
   (b) A series decreasing at a decreasing rate will plot
       with a downward slope convex to the base.
   (c) A series decreasing at an increasing rate will plot
       with a downward slope concave to the base and will
       rapidly approach, but never reach, zero.

3. Two series increasing or decreasing at the same rate
   will plot as parallel lines on the ratio scale.

4. There is no zero line.

5. If two curves or two segments of the same curve vary in
   slope, the steeper is changing at a faster percentage rate.


Economic Curves 


There are certain economic laws which are expressed
through curves. A law in Economics is a statement of general
tendencies. Statisticians have devised various equations to
explain different laws of Economics. These equations have been
developed on the assumption that there is some functional
relationship between two phenomena. When the relationship
between two variables is one of complete dependence, it is said
to be a functional relationship. Thus, if the value of y is
determined by a given value of x, y is said to be a function of x.
The general expression for such a relationship is y = f(x).
If the value of y is always two times that of x, then y = 2x. In
this case x is the independent variable and y is the dependent
variable. When this functional relationship is expressed on a
graph, the independent variable is shown on the x-axis and the
dependent variable on the y-axis.


The Straight line. If two variables are so related that
their values are always the same, their relationship will be
expressed by the equation y = x. If such values are plotted on
graph paper, and a line drawn through the plotted points, it
will be a straight line passing through the origin and bisecting
the right angle xoy. Below is given an example of such a graph.

when x = 0, y = 0
  "  x = 1, y = 1
  "  x = 2, y = 2
  "  x = 3, y = 3
  "  x = 4, y = 4
  "  x = 5, y = 5

[Graph: the straight line y = x]
An equation of the first degree also discloses a straight line
relationship. The equation of the first degree is y = a + bx, where
a is a constant representing the distance from the origin to the
point of intersection of the given line and the y-axis, and b is a
constant representing the slope of the given line. Suppose we have
to draw a graph of the equation y = a + bx where a = 1 and b = 2.
Then the equation y = a + bx will become y = 1 + 2x. From this
relationship we can frame a series and make a graph of it.

y = a + bx, i.e., y = 1 + 2x

when x = 0, y = 1
  "  x = 1, y = 3
  "  x = 2, y = 5
  "  x = 3, y = 7
  "  x = 4, y = 9
  "  x = 5, y = 11
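
The framing of a series from the first-degree equation can be sketched in Python. The function name `line_values` is an assumption made for illustration; a and b are the constants of the text.

```python
def line_values(a, b, xs):
    """Return (x, y) pairs for the straight line y = a + b*x."""
    return [(x, a + b * x) for x in xs]


table = line_values(1, 2, range(6))   # y = 1 + 2x for x = 0 to 5
for x, y in table:
    print("when x =", x, " y =", y)

# Successive values of y differ by the slope b.
steps = [table[i + 1][1] - table[i][1] for i in range(len(table) - 1)]
print("constant increment:", steps)
```

The constant increment of 2 for every unit of x is the slope b, and the value 1 at x = 0 is the intercept a.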
Non-Linear Relationship. Non-linear functions are of
many forms, of which only a few of the more common are
discussed here. Functional relationships of the parabolic or
hyperbolic form are quite common in the physical sciences. Such
curves are also found to fit certain classes of economic data.
Parabolic curves are used to represent data to which the laws of
increasing and decreasing returns apply. Demand curves and
utility curves are also parabolic in nature. The general equation
in such cases is y = ax^b. The curve is parabolic when the
exponent b is positive and hyperbolic when b is negative. The
following examples will illustrate them.

Graph of the function y = x^2

when x = -5, then y = 25
  "  x = -4,   "  y = 16
  "  x = -3,   "  y = 9
  "  x = -2,   "  y = 4
  "  x = -1,   "  y = 1
  "  x = 0,    "  y = 0
  "  x = 1,    "  y = 1
  "  x = 2,    "  y = 4
  "  x = 3,    "  y = 9
  "  x = 4,    "  y = 16
  "  x = 5,    "  y = 25

The graph of this series will be:
This is a parabolic curve. An important characteristic of such a
relationship is that if the x variable increases in geometric
progression, the y variable also increases in geometric progression.
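
This characteristic can be verified numerically for a curve of the form y = ax^b. In the sketch below the function name `power_curve` and the particular values of x are assumptions chosen for illustration.

```python
def power_curve(a, b, xs):
    """Return the values of y = a * x**b for each x."""
    return [a * x ** b for x in xs]


xs = [1, 2, 4, 8, 16]            # geometric progression, common ratio 2

parabola = power_curve(1, 2, xs)     # y = x^2
hyperbola = power_curve(1, -1, xs)   # y = 1/x

parabola_ratios = [parabola[i + 1] / parabola[i] for i in range(len(xs) - 1)]
hyperbola_ratios = [hyperbola[i + 1] / hyperbola[i] for i in range(len(xs) - 1)]

print("y = x^2 :", parabola, "common ratio", parabola_ratios[0])
print("y = 1/x :", hyperbola, "common ratio", hyperbola_ratios[0])
```

When x doubles at each step, y = x^2 is multiplied by 4 at each step and y = 1/x by one half, so in both cases y also moves in geometric progression.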


A hyperbolic curve is obtained by the equation y = x^(-1), i.e., y = 1/x.

when x = 1/3, y = 3        when x = 2, y = 1/2
  "  x = 1/2, y = 2          "  x = 3, y = 1/3
  "  x = 1,   y = 1

[Graph: hyperbolic curve]
Thus the relationship between two variables which increase
by constant increments may be represented by a straight line.
The relationship between variables increasing in geometric
progression may be represented by either a parabola or a
hyperbola. There is another type of curve known as the exponential
curve. This curve constitutes a hybrid type. The equations of
the exponential curve are y = a^x and y = ab^x.

y = a^x; if a = 2, then y = 2^x.

[Table of values of x and y = 2^x]

The graph of y = 2^x will be:

[Graph: exponential curve]
The other equation is y = ab^x. If, say, a = 1 and b = 3, then the
series will be y = 1 x 3^x.

x     y
1     3
2     9
3     27
4     81
5     243

The graph will be:

[Graph: exponential curve y = ab^x]


In exponential curves the x variable increases in arithmetic
progression and the y variable in geometric progression. Malthus's
theory of population can be illustrated with the help of such a
curve.

The equation y = ab^x may also be written in logarithmic
language as
log y = log a + x log b
and y = x^2 can be expressed as log y = 2 log x.


There are certain equations which illustrate the Law of
Decreasing Cost or Increasing Returns, the Law of Demand and the
Law of Utility. The following equation illustrates the Law of
Increasing Returns and also the Law of Demand.

Assuming that x represents the market prices of a commodity
and y the quantities of the commodity demanded at the given
prices, construct a demand curve satisfying the following
equation:

log y = 2 - 0.3x
or
y = antilog (2 - 0.3x)

For x you may assume the values of Re. 1, Rs. 2, Rs. 3,
Rs. 4 and Rs. 5 to arrive at the corresponding values of y before
plotting the demand curve. (M. Com. Raj.)


y = antilog (2 - 0.3x)

x     2 - 0.3x               y (antilog)
1     2 - 0.3 × 1 = 1.7      50.12
2     2 - 0.3 × 2 = 1.4      25.12
3     2 - 0.3 × 3 = 1.1      12.59
4     2 - 0.3 × 4 = 0.8       6.31
5     2 - 0.3 × 5 = 0.5       3.16

The graph will be:

[Figure: Demand curve]
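The antilogarithms above need not be read from tables. As a hedged sketch (the function name demand is an assumption made for illustration, and common logarithms to base 10 are assumed, as the worked values imply), the schedule can be recomputed in Python:

```python
def demand(price):
    """Quantity demanded when log10(y) = 2 - 0.3x, i.e. y = 10 ** (2 - 0.3x)."""
    return 10 ** (2 - 0.3 * price)

# Prices of Re. 1 to Rs. 5, with quantities rounded to two decimal places
schedule = [(x, round(demand(x), 2)) for x in range(1, 6)]

for price, quantity in schedule:
    print(price, quantity)
```

Each rise of one rupee in price multiplies the quantity by the same factor, 10 ** -0.3, so the demand curve falls steeply at first and then flattens.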


Second and third degree curves are also very common in
illustrating certain economic behaviour. The equations for
such curves are:

y = a + bx + cx²          (2nd degree)
y = a + bx + cx² + dx³    (3rd degree)

These curves represent the Law of Diminishing Returns or
Increasing Cost. The series based on these equations is called a
potential series.

Draw an increasing cost curve, the equation to which is
y = a + bx + cx², where a = 15, b = 2 and c = 3.
The equation then becomes
y = 15 + 2x + 3x²
The series based on this equation will be:

x     y          x     y
0    15          3    48
1    20          4    71
2    31          5   100

Draw a curve to show that y decreases with x in terms of
the following equation:

y = a + bx + cx², where a = 100, b = -2 and c = -3.

The equation becomes y = 100 - 2x - 3x²

x     y          x     y
0   100          3    67
1    95          4    44
2    84          5    15
Third degree parabolic curve. What is a potential series?
Construct a curve based on the following relationship:
y = a + bx + cx² + dx³, when the values of a, b, c and d are
respectively 12, 3, 2 and 1.

(M. Com. Vikram)

The equation becomes y = 12 + 3x + 2x² + 1x³

[Table: values of x and y for this equation]

[Figure: y = 12 + 3x + 2x² + 1x³]
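A potential series of any degree can be generated mechanically from its coefficients. The sketch below is illustrative only: the function name potential_series and the choice of x = 0 to 5 for the third degree curve are assumptions, not taken from the text.

```python
def potential_series(coeffs, xs):
    """Evaluate y = a + b*x + c*x**2 + ... for each x; coeffs = [a, b, c, ...]."""
    values = []
    for x in xs:
        y = 0
        # Horner's rule: start from the highest power and work downwards
        for coefficient in reversed(coeffs):
            y = y * x + coefficient
        values.append(y)
    return values

xs = list(range(6))  # x = 0, 1, 2, 3, 4, 5

increasing_cost = potential_series([15, 2, 3], xs)     # y = 15 + 2x + 3x^2
decreasing = potential_series([100, -2, -3], xs)       # y = 100 - 2x - 3x^2
third_degree = potential_series([12, 3, 2, 1], xs)     # y = 12 + 3x + 2x^2 + x^3

print(increasing_cost)
print(decreasing)
print(third_degree)
```

The first two lists reproduce the tables for the second degree equations; the third gives values from which the third degree curve may be plotted.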


Theoretical Questions 


1. Write a short essay on the use of graphic method in 
Statistics. (M. A. Calcutta) 


2. Explain with the help of diagrams the difference between 
a frequency polygon, histogram, frequency curve and ogives. 
(M. A. Patna)


3. "The wandering of a line is more powerful in its effect on
the mind than a tabulated statement." Discuss.
Practical Questions


1. What is the position of graphs in the exposition of
statistical problems? Represent the following graphically:

Age-wise classification of workers in a factory

Age-groups                  Number of workers
20 but less than 25
25  ,,   ,,   ,,  30
30  ,,   ,,   ,,  35

(B. A. Alld.)


2. Represent graphically:

Indian Export of Mica

Year        Quantity ('000 cwt.)    Value (lakhs of rupees)
1948-49            340                      594
1949-50            298                      685
1950-51            407                    1,000
1951-52            408                    1,321
1952-53            284                      901
1953-54            255                      800
1954-55            357                      659


3. Represent the following graphically:

EXPORT OF INDIAN RAW COTTON
IN THOUSANDS OF TONS

Year        Total Exports    Exports to Japan
1926-27          569               221
1927-28          479               287
1928-29          663               293
1929-30          727               329
1930-31          701               301
1931-32          428               193
1932-33          865               194
1933-34          504               197
1934-35          623               367
1935-36          606               314

(B. Com., B. H. U.)


4. What are the advantages of the Ratio Scale over the
Natural Scale? Plot the following data graphically on the
Logarithmic Scale:

Year        Total notes issued    Notes in circulation
            (crores of Rs.)       (crores of Rs.)
1933-34          177                   167
1934-35          186                   172
1935-36          196                   167
1936-37          208                   192
1937-38          214                   185
1938-39          207                   187
1939-40          252                   287
1940-41          269                   258
1941-42          421                   410
1942-43          650                   625

(B. Com. B.H.U.)


5. Represent graphically the exports and imports of India
from the following table on the natural as well as on the ratio
scale:

Year        Exports    Imports
1929-30       345        258
1930-31       308        206
1931-32       268        176
1932-33       239        203
1933-34       275        182
1934-35       280        210
1935-36       282        216
1936-37       248        199

(M. A., Agra)


6. Represent the following distributions graphically:

Distribution of weekly index          Distribution of weekly index
numbers of cost of living             numbers of cost of living
in Bombay in 1942                     in Bombay in 1948

Index No.    No. of weeks             Index No.    No. of weeks
140-150           5                   200-210          10
150-160          10                   210-220          10
160-170          20                   220-230          10
170-180           9                   230-240           8
180-190           6                   240-250           7
190-200           2                   250-260           7

(M. Com., Agra.)
7. Represent graphically the data given below on a
sheet of graph paper to bring out clearly the relative fluctuations
in the prices of various articles. Draw such conclusions as you
can from the graphs.


Wholesale prices in Kanpur 
(in rupees per maund) 


Year Rice Wheat Linseed Gur Cotton Tobacco 


1928  7'8 Wane 70 65 84'1 17'8 
19290 — 7'7 5°5 8°0 7°83 29°8 11°83 
1980 5'8 8*6 6'5 62 175 14°5 
1981 41 2'7 42 4*2 13°3 11°6 
1932 4°3 8'4 8'5 3°5 14° 8 4'9 
1933  4'1 5:2 384 31 12°9 4° 2 
1984 37 2*8 3°6 41 15°2 57 


(М. Com., Alld.) 


8. The following table gives the population of four towns at
the time of the last seven censuses:

Figures in thousands

Year    Allahabad    Agra    Benares
1881       160        160      218       155
1891       175        169      228       194
1901       172        188      218       203
1911       172        164      204       179
1921       157        164      198       216
1931       184        205      205       243
1941       261        284      263       287

Plot the above figures on a graph paper and estimate the
population of each town for 1946.

(M. A., Agra.)




9. Represent the following data graphically and comment
upon their relationship, if any:

Year    Area                 Production
        (crores of acres)    (lakhs of tons)
1920                             140
1921                             142
1922                             150
1923                             160
1924                             160
1925                             158
1926                             152
1927                             155
1928                             165
1929                             170
1930                             169
1931                             165
1932                             158
1933                             153
1934                             157

(M. S. W. Luck)


10. Represent the following frequency distribution graphically:


Class 
0— 20 
20— 40 
40— 60 
60— 80 
80—100 


Frequency 


(B. Com. Nagpur) 


11. The following table shows the total sales of gold by the
Bank of England on foreign account. Represent the data
graphically on the logarithmic scale:

Year      £'000
1910     14,488
1911      8,228
1912      9,670
1913      7,948
1914      8,027
1915     43,076
1916      2,360

(B. Com., Allahabad.)
12. Represent the following frequency distribution by means
of a graph. Construct the cumulative frequency curve also.

Class interval    Frequency        Class interval    Frequency
 0- 5                 13            30-35                250
 5-10                 42            35-40                287
10-15                188            40-45                185
15-20                237            45-50                 42
20-25                250            50-55                 18
25-30                256


13. The populations of three towns in U. P. at the time of
the last seven censuses are given below in thousands:


Year Jhansi Saharanpur Bareilly 
1891 54 63 123 
1901 56 66 183 
1911 76 63 129 
1921 75 62 129 
1931 93 79 144 
1941 108 108 198 
1951 106 148 195 


Estimate graphically the population of these towns in 1956. 
(P. C. S.)


14. The following table gives the prices of gold and wheat
and net export of gold during the years 1931-32 to 1938-39:


Average price Average price Net export 


Years of gold of wheat (per of gold 
(per tola) maund) (crores of Rs.) 
Rs. as. Ез. 
1981-82 25 4 3.3 58 
1932-33 30 12 3.3 ` 65 i 
1933-34 88 6 2.8 572590 
1984-35 35 8 8.1 52 
1985-86 85 4 3.2 37 
1986-87 36 0 3.9 28 
1937-38 36 6 3.0 16 
1938-39 37 12 9.4 13 


Plot the above figures on a graph paper and comment upon 
the relationship. 
(M. A, Agra) 
15. The following table gives the production of sugar in Cuba,
Java and (undivided) India during 1930-39 in millions of quintals.
Represent the figures by a suitable diagram and comment on their
relationship.

Year        Cuba    Java    India
1929-30      44      29      17
1930-31      80      28      20
1931-32      25      26      24
1932-33      19      14      28
1933-34      22       6      30
1934-35      25       5      31
1935-36      25       6      36
1936-37      29      14      40
1937-38      29      14      32
1938-39      26      15      27

(M. A., Patna)


16. The following table gives the proportion of married
women in 1910 and in 1920 among women of every age. Show
graphically that the increase was most marked for the women of
younger years.

Age    Married women %    Married women %
            1910               1920
18          17.0               19.2
20          36.2               38.4
22          50.7               52.9
24          62.0               64.2
25          65.7               67.8

(B. Com., Nagpur)




CHAPTER 8 


STATISTICAL AVERAGES 


"A single number describing some feature of a set of data is
called a descriptive statistics or more commonly averages."
WALLIS & ROBERTS

Meaning. The data collected are condensed into tables.
Statistical tabulation arranges data in logical order and helps
the understanding of their real significance. But very often
tabulated data are too large for their real significance to be
grasped. In the words of R. A. Fisher, "The inherent inability
of the human mind to grasp in its entirety a large body of
numerical data compels us to seek relatively few constants that
will adequately describe the data." For that, the data need to
be condensed to a single figure which is typical and fully
representative of the entire mass of data. This representative
figure is called the 'average' or 'measure of central tendency' or
'most typical value' or 'constant'. According to Dr. Bowley, "An
average is purely a mathematical conception, such as the average
length of life in a varied population which does not correspond
to any particular group, but is only a short way of expressing
an arithmetical result." According to D. C. Jones, "an average
may be regarded as one of a class of statistical constants which
concisely label a set of observations or measurements pertaining
to a common family." Thus averages occupy an important
place in Statistics. This is the reason that Dr. Bowley has
defined statistics as the science of averages. Averages are widely
employed in statistics because they summarise the salient features
of most data so usefully. This statement of Dr. Bowley is,
however, a little misleading in so far as averages form only a
part of the techniques employed in statistical enquiry. An average
sets out, in a single quantitative measure, the most representative
or typical value or position along the scale upon which the whole
distribution of values centres. The average will satisfy the
condition of being such a function of the entire group of values
that, if all the values happen to be equal to each other, then the
average equals each one of them.


Functions of Averages. The functions of an average are:

1. It is difficult to assimilate a mass of detailed information
expressed in numerical form, even when it has been substantially
reduced by tabulation. An average makes it convenient to
express the data in a more abbreviated numerical form, yet in such
a way that the salient features of the tables are clearly brought
out. Thus an average gives a definite and precise idea of a
large group of numerical items.

2. If all the significant features of the data can be brought
out by one or two figures, the comparison of groups or classes
is a far simpler task than the detailed scrutiny of the data. The
averages, which are also called 'summary figures' or 'summary
numbers', make the process of comparison simpler and shorter.

3. Averages also help to obtain a picture of a complete
group by means of sample data. In statistical enquiries the
sample method is used very frequently. The mean of a sample gives
a good idea of the mean of the population.

4. When it is desired to trace the mathematical relationship
between different groups or classes, an average becomes essential.
Simply saying that the expected life of an average Englishman is
more than that of an average Indian is abstract and vague.
Definiteness can only come if the expected lives are
expressed with the help of some average.


Essentials of a Satisfactory Average. According to
Professors Yule and Kendall, an average should possess the
following properties:

1. An average should be rigidly defined. If it is not well
defined, then there are chances of its being influenced by the
observer's own intelligence and bias.

2. It should be based on all the observations made.
If the average is not based on all the items of the variable, it
will not be representative of the whole group.

3. It should possess properties that are obvious and simple to
comprehend. An average should not be so abstract and
mathematical as to become incapable of being understood easily
and rapidly.

4. It should be calculable with reasonable ease and rapidity.

5. It should be affected by fluctuations of sampling as
little as possible. Of course, proper care should be taken to
draw the sample correctly. If, however, two sample averages
are computed of which one shows greater differences from the
item values of the series, the more stable one should be taken
as representative of the series.

6. It should lend itself readily to algebraic treatment so that
it may be used in further mathematical work.


Kinds of Averages. The following averages are of great
value in statistical studies:

1. Arithmetic Average, or Arithmetic Mean, or Mean.
   It is represented by the symbol a.

2. Median. It is represented by the symbol M.

3. Mode. It is represented by the symbol Z.

4. Geometric Average or Geometric Mean. It is
   represented by the symbol G. M.

5. Harmonic Mean. It is represented by the symbol H. M.

6. Quadratic Mean. It is represented by the symbol Q. M.

7. Moving Average.

8. Progressive Average.

9. Composite Average.


ARITHMETIC AVERAGE 


This measure of central tendency is very common. When
we talk about a representative value, it is natural that we should
think of the ordinary average or, more correctly stated, the
arithmetic mean or simply the mean. It is obtained by adding up
all the items and dividing the sum by the number of items.


The arithmetic average is of two kinds: (1) the Simple
Arithmetic Average and (2) the Weighted Arithmetic Average.
Simple Arithmetic Average


(A) Individual Series:

The simple arithmetic average of an individual series is
calculated by the following formula:

a = (x1 + x2 + x3 + ... + xn) / N

or in short  a = Σx / N

The calculation of the arithmetic average by the above formula
is called the Direct Method. There is another method of calculating
the arithmetic average, which is called the short-cut method. The
short-cut method is based on the fact that the sum of the deviations
from the actual average is equal to zero. In this method some
average is assumed, the deviations from it are found out with
+ and - signs, and they are totalled. In the short-cut method the
following formula is used:

a = x + Σdx / N

where x = assumed average
      Σdx = total of the deviations
      N = number of items.
Illustration 1

The monthly incomes of ten families of a certain locality are
given in rupees as below:

Family    A    B    C    D    E     F    G    H     I    J
Income    85   70   10   75   500   8    42   250   40   36

Calculate the arithmetic average by (a) Direct Method and
(b) Short-cut Method.

(Agra. B. Com.)

Direct Method:

Family    Income x
A             85
B             70
C             10
D             75
E            500
F              8
G             42
H            250
I             40
J             36

N = 10    Σx = 1116

a = Σx / N = 1116 / 10 = Rs. 111.6
Short-cut Method:

Family    Income x    dx (Deviations from
                      assumed average 100)
A             85          - 15
B             70          - 30
C             10          - 90
D             75          - 25
E            500          +400
F              8          - 92
G             42          - 58
H            250          +150
I             40          - 60
J             36          - 64

N = 10                Σdx = +116

a = x + Σdx / N = 100 + 116 / 10 = 100 + 11.6 = Rs. 111.6
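The two methods can be compared side by side in a few lines of Python. This is only a sketch; the function names mean_direct and mean_short_cut are assumptions made for illustration, and the incomes are those of Illustration 1.

```python
incomes = [85, 70, 10, 75, 500, 8, 42, 250, 40, 36]

def mean_direct(items):
    """Direct method: total of the items divided by their number."""
    return sum(items) / len(items)

def mean_short_cut(items, assumed):
    """Short-cut method: assumed average plus the mean of the deviations from it."""
    deviations = [x - assumed for x in items]
    return assumed + sum(deviations) / len(items)

direct = mean_direct(incomes)
short_cut = mean_short_cut(incomes, assumed=100)

print(direct, short_cut)
```

Any assumed average gives the same answer; a value near the middle of the data merely keeps the deviations small.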


(B) Discrete Series: In a discrete series items are grouped
according to their size. It shows the distribution of items
among various measurements.

The arithmetic average in a discrete series is calculated
by the following formulae:

Direct Method:

a = Σmf / N

where Σmf = the total of the products of the frequencies
(f) with their respective measurements (m).

Short-cut Method:

a = x + Σfdx / N

where Σfdx = the total of the products of the deviations from
the assumed average and the frequencies of the items.


Illustration 2

Calculate the mean of the following data:

Size of the item   11   12   13   14   15   16   17   18   19   20
Frequency           1    2    4    7   10   11    8    4    2    1

Direct Method:

m       f       mf
11       1       11
12       2       24
13       4       52
14       7       98
15      10      150
16      11      176
17       8      136
18       4       72
19       2       38
20       1       20
Total  N = 50   Σmf = 777

a = Σmf / N = 777 / 50 = 15.54

Short-cut Method:

m       f      dx from assumed     fdx
                  average 15
11       1          -4             - 4
12       2          -3             - 6
13       4          -2             - 8
14       7          -1             - 7
15      10           0               0
16      11          +1             +11
17       8          +2             +16
18       4          +3             +12
19       2          +4             + 8
20       1          +5             + 5
Total  N = 50                Σfdx = +27

a = x + Σfdx / N = 15 + 27 / 50 = 15.54
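For a discrete series the same two routes can be sketched as follows. The function names are hypothetical; the sizes and frequencies are those of Illustration 2.

```python
sizes = list(range(11, 21))                    # m: 11, 12, ..., 20
freqs = [1, 2, 4, 7, 10, 11, 8, 4, 2, 1]       # f

def grouped_mean_direct(m, f):
    """Direct method: sum of m*f divided by N, the total frequency."""
    return sum(mi * fi for mi, fi in zip(m, f)) / sum(f)

def grouped_mean_short_cut(m, f, assumed):
    """Short-cut method: assumed average plus (sum of f*dx) / N."""
    total_fdx = sum(fi * (mi - assumed) for mi, fi in zip(m, f))
    return assumed + total_fdx / sum(f)

direct = grouped_mean_direct(sizes, freqs)
short_cut = grouped_mean_short_cut(sizes, freqs, assumed=15)

print(sum(freqs), direct, short_cut)
```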


(C) Continuous Series: While calculating the mean from a
continuous series, the same formula is used as for a
discrete series. It is assumed that the items in each class are
concentrated at the middle point of the class interval.
Illustration 3

Calculate the mean rent of the following distribution:

Rent (in rupees)    Frequency
     m                  f
 20- 30                16
 30- 40                24
 40- 50                59
 50- 60               100
 60- 70                41
 70- 80                31
 80- 90                19
 90-100                10
Total                 300

Direct Method:

m           Mid value      f         mf
(1)         m.v. (2)      (3)      (2 × 3)
 20- 30        25          16        400
 30- 40        35          24        840
 40- 50        45          59      2,655
 50- 60        55         100      5,500
 60- 70        65          41      2,665
 70- 80        75          31      2,325
 80- 90        85          19      1,615
 90-100        95          10        950
Total                   N = 300   16,950

a = Σmf / N = 16950 / 300 = 56.50 rupees.
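Indirect Method:

m           Mid value      f      dx (from 55)     fdx
 20- 30        25          16        -30          -480
 30- 40        35          24        -20          -480
 40- 50        45          59        -10          -590
 50- 60        55         100          0             0
 60- 70        65          41        +10          +410
 70- 80        75          31        +20          +620
 80- 90        85          19        +30          +570
 90-100        95          10        +40          +400
Total                     300                     +450

a = x + Σfdx / N = 55 + 450 / 300 = 55 + 1.5 = 56.50 rupees.

To make the calculations still easier, the deviations may
be divided by a common factor. This is called step deviation.
The formula becomes

a = x + (Σfdx' / N) × i     (i = common factor)

  = 55 + (45 / 300) × 10

  = 55 + 1.5 = 56.50 rupees.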
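The step deviation method lends itself to a short routine. The sketch below assumes a hypothetical function step_deviation_mean; the classes and frequencies are those of Illustration 3, the assumed average is 55 and the common factor is 10.

```python
classes = [(20, 30), (30, 40), (40, 50), (50, 60),
           (60, 70), (70, 80), (80, 90), (90, 100)]
freqs = [16, 24, 59, 100, 41, 31, 19, 10]

def step_deviation_mean(classes, freqs, assumed, factor):
    """Mean of a continuous series by step deviations from an assumed average."""
    mid_values = [(low + high) / 2 for low, high in classes]
    # Deviations from the assumed average, divided by the common factor
    steps = [(m - assumed) / factor for m in mid_values]
    total_f_step = sum(f * d for f, d in zip(freqs, steps))
    return assumed + total_f_step / sum(freqs) * factor

mean_rent = step_deviation_mean(classes, freqs, assumed=55, factor=10)
print(mean_rent)
```

Dividing by the common factor only shrinks the numbers handled in the table; multiplying back by the same factor at the end restores the scale.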


Illustration 4

Given the following frequency distribution, calculate the
arithmetic average by the Direct and Short-cut Methods:

Monthly wages in rupees     No. of workers
12.5-17.5                         3
17.5-22.5                        22
22.5-27.5                        19
27.5-32.5                        14
32.5-37.5                         3
37.5-42.5                         4
42.5-47.5                         6
47.5-52.5                         1

(M. Sc. Punjab)

Monthly wages   Mid     No. of    mf     dx     fdx    dx' (common   fdx'
Rs.             value   workers         (30)           factor 5)
(1)             (2)     (3)      (4)    (5)     (6)      (7)         (8)
12.5-17.5        15       3       45    -15    - 45      -3          - 9
17.5-22.5        20      22      440    -10    -220      -2          -44
22.5-27.5        25      19      475    - 5    - 95      -1          -19
27.5-32.5        30      14      420      0       0       0            0
32.5-37.5        35       3      105    + 5    + 15      +1          + 3
37.5-42.5        40       4      160    +10    + 40      +2          + 8
42.5-47.5        45       6      270    +15    + 90      +3          +18
47.5-52.5        50       1       50    +20    + 20      +4          + 4
Total                    72     1965           -195                  -39

By Direct Method:  a = Σmf / N = 1965 / 72 = 27.29 rupees.

By Short-cut Method:  a = x' + Σfdx / N = 30 + (-195 / 72)
                        = 30 - 2.71
                        = 27.29 rupees.

By Short-cut (Step Deviation) Method:

a = x' + (Σfdx' / N) × i
  = 30 + (-39 / 72) × 5
  = 30 - 2.71
  = 27.29 rupees.
Illustration 5

Find the mean from the following figures:

Marks            Number of students
Below  10               5
  ,,   20               9
  ,,   30              17
  ,,   40              29
  ,,   50              45
  ,,   60              60
  ,,   70              70
  ,,   80              78
  ,,   90              83
  ,,  100              85

(Note: The series should first be arranged in a class interval
series.) The series will be like this:

Marks        f     Mid value    dx' (from 55,       fdx'
                                common factor 10)
 0- 10        5        5             -5             -25
10- 20        4       15             -4             -16
20- 30        8       25             -3             -24
30- 40       12       35             -2             -24
40- 50       16       45             -1             -16
50- 60       15       55              0               0
60- 70       10       65             +1             +10
70- 80        8       75             +2             +16
80- 90        5       85             +3             +15
90-100        2       95             +4             + 8
Total        85                                     -56

a = x + (Σfdx' / N) × i = 55 + (-56 / 85) × 10
  = 55 - 6.59
  = 48.41 marks.
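The rearrangement of a "below" series into class frequencies is a matter of successive subtraction. The following is a hedged sketch in Python; the function name class_frequencies is an assumption, and the "below 90" total is taken as 83, which is what the class frequencies in the worked table imply.

```python
upper_limits = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
cumulative = [5, 9, 17, 29, 45, 60, 70, 78, 83, 85]   # students scoring below each limit

def class_frequencies(cum):
    """Turn 'less than' cumulative totals into the frequency of each class."""
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]

freqs = class_frequencies(cumulative)

# Classes are 10 marks wide, so each mid value lies 5 below the upper limit
mid_values = [limit - 5 for limit in upper_limits]

mean_marks = sum(m * f for m, f in zip(mid_values, freqs)) / sum(freqs)
print(freqs, round(mean_marks, 2))
```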


Charlier's accuracy check. The accuracy of the calculations
of the arithmetic average can be checked by the following
formula given by Charlier:

Σfdx = Σ{f(dx + 1)} - Σf

In Illustration 5, if +1 is added to the deviations they would
become:

dx + 1               f       f(dx + 1)
-5 + 1 = -4           5         -20
-4 + 1 = -3           4         -12
-3 + 1 = -2           8         -16
-2 + 1 = -1          12         -12
-1 + 1 =  0          16           0
 0 + 1 = +1          15         +15
+1 + 1 = +2          10         +20
+2 + 1 = +3           8         +24
+3 + 1 = +4           5         +20
+4 + 1 = +5           2         +10
Total                85         +29

-56 = 29 - 85
-56 = -56. Hence the calculations are correct.
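Charlier's check amounts to computing the same total in two ways. A minimal sketch, using the frequencies and step deviations of Illustration 5 (the variable names are assumptions made for illustration):

```python
freqs = [5, 4, 8, 12, 16, 15, 10, 8, 5, 2]
steps = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]    # step deviations dx

sum_f = sum(freqs)
sum_fdx = sum(f * d for f, d in zip(freqs, steps))
sum_f_dx_plus_1 = sum(f * (d + 1) for f, d in zip(freqs, steps))

# Charlier's identity: sum(f*dx) = sum(f*(dx + 1)) - sum(f)
check_passes = sum_fdx == sum_f_dx_plus_1 - sum_f

print(sum_fdx, sum_f_dx_plus_1, sum_f, check_passes)
```

The identity holds for any correct table because f(dx + 1) = f·dx + f for every row; a mismatch therefore points to a slip in one of the products or totals.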


Illustration 6

Find the mean from the following data:

Marks            No. of Students
Above   0              80
  ,,   10              77
  ,,   20              72
  ,,   30              65
  ,,   40              55
  ,,   50              43
  ,,   60              28
  ,,   70              16
  ,,   80              10
  ,,   90               8
  ,,  100               0

(Note: This series will be rearranged in class intervals like
0-10, 10-20, and the frequency of each class interval will be
found by deducting each cumulative total from the one before it.)

Marks        f     m.v.    dx (45)    dx' (c.f. = 10)    fdx'
 0- 10        3      5      -40           -4             -12
10- 20        5     15      -30           -3             -15
20- 30        7     25      -20           -2             -14
30- 40       10     35      -10           -1             -10
40- 50       12     45        0            0               0
50- 60       15     55      +10           +1             +15
60- 70       12     65      +20           +2             +24
70- 80        6     75      +30           +3             +18
80- 90        2     85      +40           +4             + 8
90-100        8     95      +50           +5             +40
Total        80                                          +54

a = x + (Σfdx' / N) × i = 45 + (54 / 80) × 10
  = 45 + 6.75
  = 51.75 marks.
Illustration 7

Compute the average wage for the following frequency
distribution of wages:

Central wage (Rs.)      15   20   25   30   35   40   45   50   55
Wage earners (No.)       2   22   19   14    3    4    6    1    1

(This series needs rearrangement. The central wage given is the
mid value, hence the class intervals will be found out.)

m              m.v.     f      dx (35)     fdx
12.5-17.5       15       2      -20       - 40
17.5-22.5       20      22      -15       -330
22.5-27.5       25      19      -10       -190
27.5-32.5       30      14      - 5       - 70
32.5-37.5       35       3        0          0
37.5-42.5       40       4      + 5       + 20
42.5-47.5       45       6      +10       + 60
47.5-52.5       50       1      +15       + 15
52.5-57.5       55       1      +20       + 20
Total                   72                -515

a = x + Σfdx / N = 35 + (-515 / 72)
  = 35 - 7.15
  = 27.85 rupees.
Illustration 8

The following are the monthly salaries in rupees of 30
employees of a firm:

The firm gave bonus of Rs. 10, 15, 20, 25, 30 and 35 for
individuals in the respective salary groups:
exceeding 60 but not exceeding 75, exceeding 75 but not
exceeding 90, and so on up to exceeding 135 but not exceeding 150.
Find out the average bonus paid per employee.

(B. Com. B. H. U.)

(First we have to find out the number of employees getting
salaries in the class intervals given above.)

Frequency Table

Exceeding    Not exceeding    No. of employees    Bonus paid
   60             75                 3                10
   75             90                 4                15
   90            105                 5                20
  105            120                 5                25
  120            135                 7                30
  135            150                 6                35
Total                               30

Bonus paid    No. of workers     mf
    m               f
   10               3             30
   15               4             60
   20               5            100
   25               5            125
   30               7            210
   35               6            210
Total              30            735

a = Σmf / N = 735 / 30 = 24.5 rupees.
Illustration 9

Make a frequency table having grades of wages with class
intervals of two annas each from the following data of daily
wages received by 30 labourers in a certain factory, and then
compute the average daily wages paid to the labourers:

Daily wages in annas: 14, 16, 16, 14, 22, 13, 15, 24,
                      12, 23, 14, 20, 17, 21, 18, 18,
                      19, 20, 17, 16, 15, 11, 12, 21,
                      20, 17, 18, 19, 22, 23.

(The series will be arranged as desired in class intervals.)

m          f      m.v.     mf
11-13      3       12       36
13-15      4       14       56
15-17      5       16       80
17-19      6       18      108
19-21      5       20      100
21-23      4       22       88
23-25      3       24       72
Total     30               540

a = Σmf / N = 540 / 30 = 18 as., or Re. 1 as. 2.

Combined Arithmetic Average. If the means of different
samples are given, the combined mean can be found out with the
help of the following formula:

Combined Mean = (f1a1 + f2a2 + f3a3 + f4a4 + ... + fnan)
                / (f1 + f2 + f3 + f4 + ... + fn)

where f1, f2, f3 etc. are the frequencies of the different groups
and a1, a2, a3, a4 etc. are the arithmetic averages of those groups.
Illustration 10

A distribution consists of three components with total
frequencies of 200, 250 and 300 having means of 25, 10 and 15
respectively. Find out the mean of the combined distribution.

(M. Com. B. H. U.)

Sample    Frequency    Mean
(1)          200        25
(2)          250        10
(3)          300        15

Combined Mean = (f1a1 + f2a2 + f3a3) / (f1 + f2 + f3)

              = ((200 × 25) + (250 × 10) + (300 × 15)) / (200 + 250 + 300)

              = (5000 + 2500 + 4500) / 750

              = 12000 / 750

              = 16
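The combined mean is each group mean weighted by its frequency. A short sketch in Python follows; the function name combined_mean is hypothetical, and the figures are those of Illustration 10.

```python
def combined_mean(frequencies, means):
    """Combined mean: sum of (frequency × group mean) over the total frequency."""
    total = sum(f * a for f, a in zip(frequencies, means))
    return total / sum(frequencies)

result = combined_mean([200, 250, 300], [25, 10, 15])
print(result)
```

Note that simply averaging the three means, (25 + 10 + 15) / 3, would ignore the unequal sizes of the groups.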


Weighted Arithmetic Average. The weighted arithmetic
average plays a very important part in economic studies.
It is that average which is obtained by applying to the items
weights as judged by their relative importance. In certain cases
the weighted arithmetic average is the only suitable method of
comparison. For instance, if we want to know the average
expenditure of a college in Calcutta, it would be wrong to find it
by adding the expenditures of all the colleges and dividing the
total by the number of colleges. This would give a college with
200 students the same weight as a college with 2,000 students.
Weights are assigned to different items according to their
relative importance. The formulae for calculating the weighted
arithmetic average are:

Direct Method:     aw = Σxw / Σw

Short-cut Method:  aw = x + Σdxw / Σw
Illustration 11

A candidate obtains the following percentages in an
examination: English 73, Economics 82, Accounts 57, Business
Administration 62 and Commerce 60. Find the candidate's
weighted mean marks if weights of 4, 3, 3, 1, 1 respectively are
allotted to the subjects. Find his unweighted mean as well.

Subjects       Marks x    Weights w     xw     dx (60)    dxw
English           73          4         292     +13       +52
Economics         82          3         246     +22       +66
Accounts          57          3         171     - 3       - 9
Bus. Admn.        62          1          62     + 2       + 2
Commerce          60          1          60       0         0
Total            334         12         831              +111

Simple a = Σx / N = 334 / 5 = 66.8%

Weighted a (Direct Method) = Σxw / Σw = 831 / 12 = 69.25%

Weighted a (Short-cut Method) = x + Σdxw / Σw = 60 + 111 / 12
                              = 60 + 9.25
                              = 69.25%
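Both routes to the weighted mean are easily verified by machine. The sketch below uses hypothetical function names; the marks and weights are those of Illustration 11.

```python
marks = [73, 82, 57, 62, 60]
weights = [4, 3, 3, 1, 1]

def weighted_mean_direct(x, w):
    """Direct method: sum of x*w divided by the sum of the weights."""
    return sum(xi * wi for xi, wi in zip(x, w)) / sum(w)

def weighted_mean_short_cut(x, w, assumed):
    """Short-cut method: assumed average plus (sum of dx*w) / (sum of w)."""
    total_dxw = sum((xi - assumed) * wi for xi, wi in zip(x, w))
    return assumed + total_dxw / sum(w)

simple = sum(marks) / len(marks)
weighted = weighted_mean_direct(marks, weights)
weighted_sc = weighted_mean_short_cut(marks, weights, assumed=60)

print(simple, weighted, weighted_sc)
```

The weighted mean exceeds the simple mean here because the heavier weights fall on the two subjects with the highest marks.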


Illustration 12

The following table gives the results of certain examinations
of three universities in the year 1957. Which is the best
university?

Percentage results in the University

Exam.        A      B      C
M.A.         80     70     70
M.Sc.        65     70     60
B.A.         70     80     70
B.Sc.        60     70     80
B.Com.       75     60     70

(M. A. Calcutta)

The simple arithmetic average of

A University = (80 + 65 + 70 + 60 + 75) / 5 = 350 / 5 = 70%

B University = (70 + 70 + 80 + 70 + 60) / 5 = 350 / 5 = 70%

C University = (70 + 60 + 70 + 80 + 70) / 5 = 350 / 5 = 70%


Here the simple arithmetic average is unable to give any
indication of the relative standard of the A, B and C universities,
because it is the same for all three. Hence it is
desirable to assign weights. Weights may be assigned on the
assumption that the number of students appearing at the
B.A. examination is more than the number of students
appearing at the M.A. examination. It is also assumed that there
are more students in Arts than in Commerce, and more
students in Commerce than in Science. It is further assumed that
the number of students differs in the three universities.


A University B University C University 
Weighted Arithmetic Averages

A University:  aw = Σx1w1 / Σw1 = 70.0%

B University:  aw = Σx2w2 / Σw2 = 2610 / 37 = 70.54%

C University:  aw = Σx3w3 / Σw3 = 3900 / 55 = 70.91%

According to the weights assigned, C University is the
best.
Illustration 13

Taking an imaginary example, show the conditions under
which:

(i) a = wa, (ii) a > wa, (iii) a < wa

(M. Com. B.H.U.)

(i) The simple arithmetic average will be equal to the
weighted arithmetic average when all the items are assigned
equal weights.

(ii) The simple arithmetic average will be greater than the
weighted arithmetic average when items of small value are
given greater weights and items of big value are assigned smaller
weights.

(iii) The simple arithmetic average will be less than the
weighted arithmetic average when items of small value are given
smaller weights and items of big value are given greater weights.

This can be shown by the following example:

Condition (i)        Condition (ii)        Condition (iii)

Simple average = Σx / N = 40 units

Weighted average under the first condition:
wa = Σxw / Σw = 1000 / 25 = 40 units
(This is equal to the simple arithmetic average: 40 = 40)

Weighted average under the second condition:
wa = Σxw / Σw = 500 / 15 = 33.33 units
(The simple arithmetic average is greater than the weighted
arithmetic average: 40 > 33.33)

Weighted average under the third condition:
wa = Σxw / Σw = 700 / 15 = 46.67 units
(The simple arithmetic average is less than the weighted
arithmetic average: 40 < 46.67)
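The three conditions can be tried out on any small set of figures. In the sketch below the items 20, 30, 40, 50, 60 and the three sets of weights are purely illustrative assumptions, chosen only so that the simple average is 40 units; they are not taken from the table of the text.

```python
items = [20, 30, 40, 50, 60]          # assumed values; simple average = 40

def weighted_mean(x, w):
    """Weighted arithmetic average: sum of x*w over the sum of the weights."""
    return sum(xi * wi for xi, wi in zip(x, w)) / sum(w)

simple = sum(items) / len(items)

equal_weights = weighted_mean(items, [5, 5, 5, 5, 5])      # condition (i)
small_heavy = weighted_mean(items, [5, 4, 3, 2, 1])        # condition (ii)
large_heavy = weighted_mean(items, [1, 2, 3, 4, 5])        # condition (iii)

print(simple, equal_weights, round(small_heavy, 2), round(large_heavy, 2))
```

With equal weights the two averages coincide; heavier weights on the small items pull the weighted average below the simple one, and heavier weights on the large items push it above.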


Advantages of Arithmetic Average. The arithmetic average is
the most commonly used average. It is widely understood,
and an average man is more familiar with it than with the
median or the mode. It owes this familiarity to certain
advantages.

Advantages

1. The arithmetic average is simple to compute and is the best
understood average. An elementary knowledge of addition,
multiplication and division is sufficient for calculating an
arithmetic average.

2. An arithmetic average is an exact or computed figure
and it is suitable for further mathematical treatment.

3. In the calculation of the arithmetic average every item in
the series is included. Hence this is the most representative
average.

4. The computation of the arithmetic average does not require
the arrangement or grouping of items as is done in the
computation of the median.

5. The arithmetic average is a good basis of comparison.
For example, when we want to compare the marks of students
of the same class in two colleges, the arithmetic averages of
the marks in the colleges form the correct basis for assessing
the relative efficiency of the colleges.

6. It is possible to calculate the arithmetic average even
if some of the details of the data are lacking. If the aggregate of
the items is given and the total number of items is known, the
arithmetic average can be found out.

7. The arithmetic mean, depending for its calculation on
all the observations, has the main advantage that it is
least affected by fluctuations in sampling. That is to say, if in
the course of observations a few unbiased errors of measurement
occur, those errors, which are usually both positive and
negative and of small relative magnitude, balance one another,
thereby ensuring stability of the average.

8. The arithmetic average is amenable to further
manipulation. If the means of two or more groups are given and
the number of items in each group is given, we can find out an
overall arithmetic average.


Disadvantages

1. Extreme items have a disproportionate effect on the
mean and reduce its usefulness as a summary of the whole.

2. Sometimes the arithmetic mean may not be an actual item
in the series, and as such it is called a fictitious average.

3. An arithmetic average may give a wrong impression in
the absence of complete data. For example, the profits of two
businesses are as given below.

Profit for    Business I    Business II
1957            15,000        40,000
1958            20,000        25,000
1959            25,000        20,000
1960            40,000        15,000
Total         1,00,000      1,00,000

The means of the profits of these two businesses are the
same, i.e. Rs. 25,000, but the figures show that Business I is
progressive and Business II is declining.

4. The arithmetic mean cannot be computed by simply observing
the series.

5. For the computation of the mean, it is essential to know
the actual values of all the items.
Properties:

(1) The arithmetic average multiplied by the number of
values in the distribution gives their aggregate.

(2) The sum of the deviations from the arithmetic average
is equal to zero.

(3) The sum of the squares of the deviations from the arithmetic
mean is less than that of those computed from any other point.
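These three properties can be verified on any series. The sketch below reuses the ten incomes of Illustration 1; the comparison points 100, 111, 112 and 150 are arbitrary assumptions chosen only for the demonstration.

```python
incomes = [85, 70, 10, 75, 500, 8, 42, 250, 40, 36]
n = len(incomes)
mean = sum(incomes) / n

# Property 1: mean × number of items gives the aggregate
aggregate = mean * n

# Property 2: the deviations from the mean sum to zero
sum_of_deviations = sum(x - mean for x in incomes)

# Property 3: the sum of squared deviations is least when taken from the mean
def sum_of_squares(point):
    return sum((x - point) ** 2 for x in incomes)

other_points = [100, 111, 112, 150]
least_at_mean = all(sum_of_squares(mean) < sum_of_squares(p) for p in other_points)

print(aggregate, round(sum_of_deviations, 6), least_at_mean)
```

Property 2 is what makes the short-cut method work: deviations from any assumed average differ from those about the true mean by a constant, and the mean of that constant is the correction added.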


MEDIAN 


Median is the value of the middle item of a series when
the items have been arranged according to their values. It
divides the distribution into two equal parts so that an equal
number of values lie on either side of it. According to ...
and Chaudhury, "Median is the value of that item in a series
which divides the series into two equal parts, one part consisting
of all values less, and the other all values greater than it."

Calculation of Median (Individual Series). For computing
the median of an individual series, it is essential that the series
is arranged in either ascending or descending order. The
median is calculated as:

M = Size of the ((N + 1) / 2)th item
М Size of the (CS) th item 
Illustration—14


Determine the Median from the following figures :
25, 15, 23, 40, 27, 25, 23, 25, 20.
The series is arranged :—
15, 20, 23, 23, 25, 25, 25, 27, 40

$M = \text{size of the } \left(\frac{N+1}{2}\right)\text{th item} = \frac{9+1}{2} = \frac{10}{2} = 5\text{th item}$

The 5th item is 25. Hence Median = 25.


Sometimes it so happens that the position of the median
comes out as a fraction. In the above example, had there been
10 items, the median would have been the $\frac{10+1}{2} = 5.5$th item. Then it
would be calculated as

$M = \frac{\text{Size of the 5th item} + \text{Size of the 6th item}}{2}$


That is, an average is taken of the 5th and 6th items. This
gives the value of the Median.


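Illustration—15


Locate Median in the following data :—
15, 31, 30, 31, 27, 23


Arranged series: 15, 23, 27, 30, 31, 31. Here N = 6.

$M = \text{size of the } \left(\frac{N+1}{2}\right)\text{th item} = \frac{6+1}{2} = \frac{7}{2} = 3.5\text{th item}$

$M = \frac{\text{3rd item} + \text{4th item}}{2} = \frac{27+30}{2} = \frac{57}{2} = 28.5$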
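
The rule for an individual series can be sketched in a few lines. The function name `median_individual` is hypothetical; the even-length case uses the data of Illustration 15 and the odd-length list is an arbitrary example.

```python
def median_individual(values):
    """Median of an individual series using the (N + 1) / 2 th item rule."""
    items = sorted(values)
    n = len(items)
    if n == 0:
        raise ValueError("the series must contain at least one item")
    position = (n + 1) / 2          # 1-based position of the median item
    whole = int(position)
    if position == whole:           # odd N: an actual item of the series
        return items[whole - 1]
    # even N: average of the two middle items
    return (items[whole - 1] + items[whole]) / 2


print(median_individual([15, 31, 30, 31, 27, 23]))  # data of Illustration 15
print(median_individual([40, 15, 25]))              # arbitrary odd-length series
```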


Discrete Series. There is no need of arranging the series
in either ascending or descending form. The reason is that in
this case we have to find out the cumulative frequency, which
automatically places the series in an ascending order. The
following steps are taken while locating the median in a discrete
series.

(1) Cumulative frequency is found.
(2) The value of the $\frac{N+1}{2}$th item is found.
(3) Median is located at the size of the item in whose
cumulative frequency the $\frac{N+1}{2}$th item falls.


Illustration—16 


Compute the Median of the following series :— 


Size of the item Frequency 

2 3 

3 8 

4 10 

5 12 

6 16 

7 14 

8 10 

9 8 

10 17 

11 5 

12 4 

13 1 

m       f      c.f. (Cumulative Frequency)

 2      3        3
 3      8       11
 4     10       21
 5     12       33
 6     16       49
 7     14       63   (Median item falls in this group)
 8     10       73
 9      8       81
10     17       98
11      5      103
12      4      107
13      1      108


$m = \frac{N+1}{2} = \frac{108+1}{2} = \frac{109}{2} = 54.5\text{th item}$

The 54.5th item falls in the c.f. 63 group. This is at the size of
the item 7. Hence Median = 7.
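
The three steps for a discrete series can be sketched as below, using the data of Illustration 16. The function name `median_discrete` is hypothetical, and taking the first size whose cumulative frequency reaches the median position is an assumption about how boundary cases are settled.

```python
from itertools import accumulate


def median_discrete(sizes, freqs):
    """Median of a discrete series located through cumulative frequencies."""
    pairs = sorted(zip(sizes, freqs))
    cumulative = list(accumulate(f for _, f in pairs))   # step 1
    position = (cumulative[-1] + 1) / 2                  # step 2
    for (size, _), cf in zip(pairs, cumulative):         # step 3
        if position <= cf:
            return size
    raise ValueError("frequencies must be positive")


sizes = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
freqs = [3, 8, 10, 12, 16, 14, 10, 8, 17, 5, 4, 1]
print(median_discrete(sizes, freqs))
```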


Continuous Series. In a continuous series,


(1) Cumulative frequency is calculated.

(2) The $\frac{N+1}{2}$th item, $m$, is calculated and on this basis
the median group is located.

(3) Then the Median is calculated by the following formula—

$M = l_1 + \frac{l_2 - l_1}{f}(m - c)$

where
$l_1$ = lower limit of the median group
$l_2$ = upper limit of the median group
$f$ = frequency of the median group
$m$ = the $\frac{N+1}{2}$th item
$c$ = cumulative frequency of the group preceding
the median group.


Illustration —17 


Calculate the median rent :—


Rent 

(in rupees) f 
20— 30 16 
30— 40 24 
40— 50 59 
50— 60 100 
60— 70 41 
70— 80 31 
80— 90 19 
90—100 10 


Total           300


Class interval    f      c.f.

20— 30           16       16
30— 40           24       40
40— 50           59       99
50— 60          100      199   (Median lies in this group)
60— 70           41      240
70— 80           31      271
80— 90           19      290
90—100           10      300

$m = \frac{N+1}{2} = \frac{300+1}{2} = \frac{301}{2} = 150.5$


Hence Median lies in the group 50—60.


By applying formula—

$M = l_1 + \frac{l_2 - l_1}{f}(m - c) = 50 + \frac{60-50}{100}(150.5 - 99) = 50 + \frac{10}{100} \times 51.5 = 50 + 5.15 = 55.15$

Basis of the formula of calculating median. The above
formula is based upon the assumption that the items belonging
to each class interval are equally distributed throughout the
class interval. In the above example the median falls in the class
interval 50—60, and the frequency of that class is 100. It is
assumed that these 100 items are equally distributed in that
class. It follows that each unit of size contains 10 items. The
median is the 150.5th item. 99 items are distributed up to the
previous class 40—50. Hence we have to advance by
150.5 − 99 = 51.5 items. Since 10 items are spread over 1 unit of
size, 51.5 items are spread over 5.15 units. This, added to the
lower limit, gives the value of the median.
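
The proportional reasoning behind the formula can be followed step by step with the figures of Illustration 17. The variable names below are chosen only for this sketch.

```python
lower, upper = 50, 60      # limits of the median group
f = 100                    # frequency of the median group
c = 99                     # cumulative frequency up to the class 40-50
m = (300 + 1) / 2          # position of the median item, 150.5

items_per_unit = f / (upper - lower)          # 10 items in each unit of size
items_to_advance = m - c                      # 51.5 items into the group
advance = items_to_advance / items_per_unit   # units of size covered
median = lower + advance

print(items_per_unit, items_to_advance, round(advance, 2), round(median, 2))
```

Dividing by the items per unit is the same as multiplying by (upper − lower) / f, which is the factor that appears in the formula.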


Illustration—18


Amend the following table and locate the median from the
amended table—


Size           Frequency
10  —16           10 }
16  —17.5         15 } added
17.5—20           17 }
22  —30           25
30  —35           28 } added
35  —40           30 }
45  —up           40
                              (Alld. B. Com.)

Amended size      F      C.F.
10—20            42       42
20—30            25       67
30—40            58      125
40—50            40      165

$m = \frac{N+1}{2} = \frac{165+1}{2} = 83$. Hence it lies in the class 30—40.

$M = l_1 + \frac{l_2 - l_1}{f}(m - c) = 30 + \frac{40-30}{58}(83 - 67) = 30 + \frac{10}{58} \times 16 = 30 + 2.76 = 32.76$


Illustration—19

Calculate the median for the following distribution :—


m             f      c.f.
100—104       4        4
105—109      14       18
110—114      60       78
115—119     138      216
120—124     206      422
125—129     298      720
130—134     380     1100
135—139     450     1550
140—144     500     2050
145—149     430     2480
150—154     260     2740
155—159     128     2868
160—164      66     2934
165—169      28     2962
170—174      12     2974

Total      2974
                              (I. A. & A. S.)


The class intervals (with exact limits) will be
 99.5—104.5
104.5—109.5
109.5—114.5
114.5—119.5
119.5—124.5
124.5—129.5
129.5—134.5
134.5—139.5
139.5—144.5
144.5—149.5
149.5—154.5
154.5—159.5
159.5—164.5
164.5—169.5
169.5—174.5


$m = \frac{N+1}{2} = \frac{2974+1}{2} = \frac{2975}{2} = 1487.5\text{th item}$, which lies in the class 134.5—139.5.

$M = l_1 + \frac{l_2 - l_1}{f}(m - c) = 134.5 + \frac{139.5-134.5}{450}(1487.5 - 1100) = 134.5 + 4.3 = 138.8$


Illustration—20


The following table gives the marks obtained by 65 students
in Statistics in a certain examination :—


Exam. Marks         No. of Students
More than 70%             7
    "     60%            18
    "     50%            40
    "     40%            40
    "     30%            63
    "     20%            65

Calculate the median.

          I way                              II way
m                 f     c.f.       m              f     c.f.
20—30%            2       2        above 70%      7       7
30—40%           23      25        60—70%        11      18
40—50%            0      25        50—60%        22      40
50—60%           22      47        40—50%         0      40
60—70%           11      58        30—40%        23      63
70% and above     7      65        20—30%         2      65

$m = \frac{N+1}{2} = \frac{65+1}{2} = \frac{66}{2} = 33$

I. $M = l_1 + \frac{l_2 - l_1}{f}(m - c) = 50 + \frac{60-50}{22}(33 - 25) = 50 + \frac{10}{22} \times 8 = 50 + 3.6 = 53.6\%$ marks

II. $M = l_1 + \frac{l_2 - l_1}{f}(m - c^{*}) = 50 + \frac{60-50}{22}\{33 - (65 - 40)\} = 50 + \frac{10}{22} \times 8 = 50 + 3.6 = 53.6\%$ marks


*c = difference between the total frequency and the cumulative
frequency of the median group. If items are arranged in
descending order, this change is essential.
Illustration—21


Dealing in a certain security at the following prices took
place on the Bombay Stock Exchange. Calculate the Median
price.

100 5/16, 100 3/8, 100 1/4, 100 5/16, 100 3/16, 100 1/4, 100 3/8, 100 9/32,
100 11/32, 100 1/16, 100 1/8, 100, 99 7/8, 99 9/32, 99 11/32, 99 3/8, 99 1/4
                              (B. Com. B.H.U.)

Notes :—In the above question prices are given in fractions.
Hence it will be a bit difficult to arrange them in order.
It is therefore suggested to find out the L.C.M. of the
denominators and then bring the numerators to one level.
The L.C.M. will be 32, and the numerators over 32 are


10, 12, 8, 10, 6, 8, 12, 9, 11, 2, 4, 0, 28, 9, 11, 12, 8

Keeping in mind that the first 12 items (including the price of
exactly 100) have 100 as the whole number and the remaining
5 items have 99 as the whole number, the series will be arranged.


Item No.    Price          Item No.    Price
   1        99 1/4             9       100 3/16
   2        99 9/32           10       100 1/4
   3        99 11/32          11       100 1/4
   4        99 3/8            12       100 9/32
   5        99 7/8            13       100 5/16
   6        100               14       100 5/16
   7        100 1/16          15       100 11/32
   8        100 1/8           16       100 3/8
                              17       100 3/8

The Median item $= \frac{N+1}{2} = \frac{17+1}{2} = \frac{18}{2} = 9\text{th}$


The value of the 9th item is 100 3/16.
Hence M = 100 3/16.
Other Measures on the Principle of Median. The median
item occupies a midway position in the distribution. There are
several other measures of central tendency which are used in
certain statistical measurements. They are allied to the median
in that they are based upon their position in a series.
These measures are—

(1) Quartiles—If a distribution is divided into four equal
parts then there will be three quartiles, $Q_1$, $Q_2$ and $Q_3$.
$Q_2$ is the median itself. Hence $Q_1$ and $Q_3$ may be
determined. $Q_1$ is also called the lower quartile and $Q_3$
the upper quartile.

(2) Quintiles—If a series is divided into five equal parts,
then there will be 4 Quintiles, $Qu_1$, $Qu_2$, $Qu_3$ and $Qu_4$.

(3) Deciles—If a series is divided into ten equal parts, then
there will be 9 Deciles; the fifth decile will be the median
itself. The deciles are noted as $D_1$, $D_2$, $D_3$ and so on.

(4) Percentiles—If a distribution is divided into hundred equal
parts then there will be 99 Percentiles. They are noted as
$P_1$, $P_2$, $P_3$ and so on.

(5) Octiles—If a distribution is divided into 8 equal parts
then there will be 7 Octiles, known as $O_1$, $O_2$ and so on.


These different measures are calculated just as we calculate
the median.

Quartiles

$Q_1 = \frac{N+1}{4}$th item, $Q_3 = \frac{3(N+1)}{4}$th item

Quintiles

$Qu_1 = \frac{N+1}{5}$th item, $Qu_2 = \frac{2(N+1)}{5}$th item,
$Qu_3 = \frac{3(N+1)}{5}$th item, $Qu_4 = \frac{4(N+1)}{5}$th item

Deciles

$D_1 = \frac{N+1}{10}$th item, $D_4 = \frac{4(N+1)}{10}$th item,
$D_9 = \frac{9(N+1)}{10}$th item, and others are also found out like this.

Percentiles

$P_1 = \frac{N+1}{100}$th item, $P_{40} = \frac{40(N+1)}{100}$th item,
$P_{90} = \frac{90(N+1)}{100}$th item, and others are also found out like this.

Octiles

$O_1 = \frac{N+1}{8}$th item, $O_3 = \frac{3(N+1)}{8}$th item,
and others are also found out like this.
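
All of these measures follow one pattern: the k-th of `parts` equal divisions is the k(N + 1)/parts th item. A minimal sketch for an individual series is given below. The function name `partition_value` is hypothetical, and linear interpolation between neighbouring items for fractional positions is an assumption (at a position ending in .5 it reduces to the average of the two items). The series 1 to 59 is an arbitrary example with N = 59.

```python
def partition_value(values, k, parts):
    """Value of the k-th partition point when the series is cut into `parts` parts."""
    items = sorted(values)
    n = len(items)
    position = k * (n + 1) / parts        # 1-based position of the required item
    whole = int(position)
    fraction = position - whole
    if whole < 1 or whole > n or (whole == n and fraction > 0):
        raise ValueError("position falls outside the series")
    if fraction == 0:
        return items[whole - 1]
    # Assumed rule: interpolate between the two neighbouring items.
    return items[whole - 1] + fraction * (items[whole] - items[whole - 1])


series = list(range(1, 60))               # hypothetical series of 59 items
print(partition_value(series, 1, 4))      # first quartile, 15th item
print(partition_value(series, 3, 8))      # third octile, 22.5th item
print(partition_value(series, 95, 100))   # 95th percentile, 57th item
```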




Individual Series 


Illustration—22 




The attendance of outdoor patients for 59 days in Madho
Dispensary is as under. Find out the Median, Quartiles, 2nd
Quintile, 3rd and 5th Octiles, 2nd, 3rd and 7th Deciles, 35th
and 95th Percentiles :—


105, 92, 87, 115, 124, "180, 95, 
Yir: ЕИБ тво, 180,[:417Д 87, 
108, 114 141, 189, 96, 89, 104, 
SO IU 11675 419, 8 129509 114; 198; “өр, 
187- 9384, 98, 1105, 108,1, 86.5 101, 
100, 130, 112, 129, 184, 148, 149, 
9172 341^ $107,462). 1192 E191 86150; 
107,5 118, 115, . 92, .181,  184,. 108, 
с _———————— 
S.No.  Item  |  S.No.  Item  |  S.No.  Item  |  S.No.
  1     36   |   16    101   |   31    114   |   46
  2     77   |   17    103   |   32    114   |   47
  3     87   |   18    104   |   33    115   |   48
  4     87   |   19    105   |   34    115   |   49
  5     89   |   20    105   |   35    116   |   50
  6     89   |   21    107   |   36    117   |   51
  7     91   |   22    107   |   37    120   |   52
  8     92   |   23    108   |   38    121   |   53
  9     92   |   24    108   |   39    122   |   54
 10     93   |   25    112   |   40    123   |   55
 11     95   |   26    112   |   41    124   |   56
 12     95   |   27    112   |   42    128   |   57
 13     96   |   28    113   |   43    129   |   58
 14     99   |   29    113   |   44    129   |   59
 15    100   |   30    113   |   45    130   |


Median $= \frac{N+1}{2}$th item $= \frac{59+1}{2} = 30$th item

The value of the 30th item is 113.

$Q_1 = \frac{N+1}{4} = \frac{59+1}{4} = 15$th item, the value of which = 100

$Q_3 = \frac{3(N+1)}{4} = \frac{3(59+1)}{4} = 45$th item, the value of which = 130

$Qu_2 = \frac{2(N+1)}{5} = \frac{2(59+1)}{5} = 24$th item, the value of which = 108

$Oct_3 = \frac{3(N+1)}{8} = \frac{3(59+1)}{8} = 22.5$th item
$= \frac{\text{22nd item} + \text{23rd item}}{2} = \frac{107+108}{2} = 107.5$

$Oct_5 = \frac{5(N+1)}{8} = \frac{5(59+1)}{8} = 37.5$th item
$= \frac{\text{37th item} + \text{38th item}}{2} = \frac{120+121}{2} = 120.5$

$D_2 = \frac{2(N+1)}{10} = \frac{2(59+1)}{10} = 12$th item = 95

$D_3 = \frac{3(N+1)}{10} = \frac{3(59+1)}{10} = 18$th item = 104

$D_7 = \frac{7(N+1)}{10} = \frac{7(59+1)}{10} = 42$nd item = 128

$P_{35} = \frac{35(N+1)}{100} = \frac{35(59+1)}{100} = 21$st item = 107

$P_{95} = \frac{95(N+1)}{100} = \frac{95(59+1)}{100} = 57$th item = 148


Discrete Series 


Illustration—23 
Calculate the Median, $Q_1$, $Q_3$, $D_4$, $D_7$, $O_3$, $O_7$, $P_{13}$, $P_{60}$, $P_{90}$
from the following :—


Earnings    9  10  11  12  13  14  15  16  17  18  19  20  21  22  23

Employees   3   6  10  15  24  42  75  90  79  55  36  26  19  13   8


M       F      c.f.
 9      3        3
10      6        9
11     10       19
12     15       34
13     24       58
14     42      100
15     75      175
16     90      265
17     79      344
18     55      399
19     36      435
20     26      461
21     19      480
22     13      493
23      8      501

Median $= \frac{N+1}{2} = \frac{501+1}{2} = \frac{502}{2} = 251$st item = 16

$Q_1 = \frac{N+1}{4} = \frac{502}{4} = 125.5$th item = 15

$Q_3 = \frac{3(N+1)}{4} = \frac{1506}{4} = 376.5$th item = 18

$D_4 = \frac{4(N+1)}{10} = \frac{2008}{10} = 200.8$th item = 16

$D_7 = \frac{7(N+1)}{10} = \frac{3514}{10} = 351.4$th item = 18

$O_3 = \frac{3(N+1)}{8} = \frac{1506}{8} = 188.25$th item = 16

$O_7 = \frac{7(N+1)}{8} = \frac{3514}{8} = 439.25$th item = 20

$P_{13} = \frac{13(N+1)}{100} = \frac{6526}{100} = 65.26$th item = 14

$P_{60} = \frac{60(N+1)}{100} = \frac{30120}{100} = 301.2$th item = 17

$P_{90} = \frac{90(N+1)}{100} = \frac{45180}{100} = 451.8$th item = 20
Continuous Series

Illustration—24


From the following distribution calculate the Median, $Q_1$, $Q_3$,
8th Decile, and 56th Percentile.


M.          Frequency
 1— 3           6
 3— 5          53
 5— 7          85
 7— 9          56
 9—11          21
11—13          16
13—15           4
15—17           4

M.          F       c.f.
 1— 3        6        6
 3— 5       53       59
 5— 7       85      144
 7— 9       56      200
 9—11       21      221
11—13       16      237
13—15        4      241
15—17        4      245

Median: $m = \frac{N+1}{2} = \frac{245+1}{2} = 123$rd item, which falls in the
group 5—7.

$M = l_1 + \frac{l_2 - l_1}{f}(m - c) = 5 + \frac{7-5}{85}(123 - 59) = 5 + \frac{2}{85} \times 64 = 6.5$


$q_1 = \frac{N+1}{4} = \frac{245+1}{4} = \frac{246}{4} = 61.5$th item. It falls in the
group 5—7.

$Q_1 = l_1 + \frac{l_2 - l_1}{f}(q_1 - c) = 5 + \frac{7-5}{85}(61.5 - 59) = 5 + \frac{2}{85} \times 2.5 = 5.06$


$q_3 = \frac{3(N+1)}{4} = \frac{3(245+1)}{4} = \frac{738}{4} = 184.5$th item. It falls in
the group 7—9.

$Q_3 = l_1 + \frac{l_2 - l_1}{f}(q_3 - c) = 7 + \frac{9-7}{56}(184.5 - 144) = 7 + \frac{2}{56} \times 40.5 = 8.45$


$d_8 = \frac{8(N+1)}{10} = \frac{8(245+1)}{10} = \frac{1968}{10} = 196.8$th item. It falls in
the group 7—9.

$D_8 = l_1 + \frac{l_2 - l_1}{f}(d_8 - c) = 7 + \frac{9-7}{56}(196.8 - 144) = 7 + \frac{2}{56} \times 52.8 = 8.9$


$p_{56} = \frac{56(N+1)}{100} = \frac{56(245+1)}{100} = 137.76$th item,
which falls in the group 5—7.

$P_{56} = l_1 + \frac{l_2 - l_1}{f}(p_{56} - c) = 5 + \frac{7-5}{85}(137.76 - 59) = 5 + \frac{2}{85} \times 78.76 = 6.85$
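
Since the Median, Quartiles, Deciles and Percentiles of a continuous series all use the same interpolation, one routine can serve for all of them. The sketch below applies it to the distribution of Illustration 24; the function name `grouped_partition_value` is hypothetical.

```python
def grouped_partition_value(classes, freqs, k, parts):
    """k-th partition value of a continuous series by interpolation in its class."""
    n = sum(freqs)
    m = k * (n + 1) / parts                # position of the required item
    c = 0                                  # cumulative frequency of earlier classes
    for (l1, l2), f in zip(classes, freqs):
        if f > 0 and m <= c + f:
            return l1 + (l2 - l1) / f * (m - c)
        c += f
    raise ValueError("position falls outside the distribution")


classes = [(1, 3), (3, 5), (5, 7), (7, 9), (9, 11), (11, 13), (13, 15), (15, 17)]
freqs = [6, 53, 85, 56, 21, 16, 4, 4]

for label, k, parts in [("Median", 1, 2), ("Q1", 1, 4), ("Q3", 3, 4),
                        ("D8", 8, 10), ("P56", 56, 100)]:
    print(label, round(grouped_partition_value(classes, freqs, k, parts), 2))
```

Carried to two decimal places, the Median and the 8th Decile come to 6.51 and 8.89, which the hand calculation rounds to 6.5 and 8.9.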


Graphic Location of Median. Median can also be located
graphically. There are two methods of locating the median with
the help of graphs.


(A) By an ogive—An ogive is a cumulative frequency
curve. The curve shows a rising trend. The middle item is
marked out on the vertical scale and a line parallel to the base
is drawn cutting the ogive at a point. From that point a
line perpendicular to the base is dropped and the value of the
median is found out. Similarly Quartiles etc. are also located.

Illustration—25


The frequency distribution of marks obtained by a class of
students shows the following :—


Marks         F       c.f.
 0— 30       10        10
30— 40       15        25
40— 50       30        55
50— 60       32        87
60— 70        8        95
70—100        5       100


(a) Find the median by drawing the ogive curve.
(b) Check up the value of the median.


$m = \frac{N+1}{2} = \frac{100+1}{2} = 50.5$

$M = 40 + \frac{50-40}{30}(50.5 - 25) = 40 + 8.5 = 48.5$

$q_1 = \frac{N+1}{4} = \frac{100+1}{4} = 25.25$

$Q_1 = 40 + \frac{50-40}{30}(25.25 - 25) = 40 + \frac{10}{30} \times 0.25 = 40.08$

$q_3 = \frac{3(N+1)}{4} = \frac{3(100+1)}{4} = 75.75$

$Q_3 = 50 + \frac{60-50}{32}(75.75 - 55) = 50 + 6.48 = 56.48$


[Figure: Ogive curve (locating the Median). Vertical axis: No. of students; horizontal axis: Measurement.]

Median can also be located by drawing the 'less than' ogive
and the 'greater than' ogive. The intersection point of
these two ogives locates the Median. In the above example the
median can also be located thus—


M           F     c.f. (Ascending)    c.f. (Descending)
 0— 30     10          10                  100
30— 40     15          25                   90
40— 50     30          55                   75
50— 60     32          87                   45
60— 70      8          95                   13
70—100      5         100                    5


[Figure: Ogive curves ('more than' ogive and 'less than' ogive). Vertical axis: No. of students; horizontal axis: Measurements.]
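
The intersection of the two ogives can also be found by calculation instead of by drawing. The sketch below assumes straight-line segments between class boundaries; the function name `ogive_intersection` is hypothetical. The two curves cross where the cumulative frequency equals N/2, so the value read off may differ slightly from a median based on the (N + 1)/2 th item.

```python
from itertools import accumulate


def ogive_intersection(bounds, freqs):
    """Measurement at which the 'less than' and 'more than' ogives cross."""
    n = sum(freqs)
    less_than = [0] + list(accumulate(freqs))    # c.f. at each class boundary
    more_than = [n - c for c in less_than]       # descending c.f. at each boundary
    for i, f in enumerate(freqs):
        crosses = less_than[i] <= more_than[i] and less_than[i + 1] >= more_than[i + 1]
        if f > 0 and crosses:
            # On this segment the 'less than' ogive reaches n / 2.
            share = (n / 2 - less_than[i]) / f
            return bounds[i] + share * (bounds[i + 1] - bounds[i])
    raise ValueError("the ogives do not cross")


bounds = [0, 30, 40, 50, 60, 70, 100]   # class boundaries of Illustration 25
freqs = [10, 15, 30, 32, 8, 5]
print(round(ogive_intersection(bounds, freqs), 2))
```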


(B) Galton's Method of locating Median. Galton has
developed another graphic method of locating the median.
According to his method, for every measurement, points
corresponding to the frequency of that measurement are plotted
in such a way that the last point of the previous measurement
forms the base for the points of the measurement under study.
The distance between successive points corresponds to the
distance marked on the vertical scale for each point. A line is
drawn in such a way that it passes through the centre of each
set of these points. The Median is then located in the same way
as in the ogive method (A) shown above.


Illustration —26 


The following table shows the marks obtained by a batch
of 23 M. Com. students in Statistics out of 100.


60 | 70 


4 


50 


20 


1 


Marks—| 10 80 | 40 


2| 2 


Frequency—| 2 3 


Determine the Median by Galton's method— 


[Figure: Median by Galton's method. Vertical axis: cumulated number of students (0 to 25); horizontal axis: Marks (10 to 90).]


Illustration—27


Put the following information in the form of a frequency
distribution and make an estimate of the mean wage.


In a certain group of wage earners the Median and Quartile
wages were Rs. 37, Rs. 29.5 and Rs. 40.5 per week respectively.
6% of the workers got less than Rs. 20 per week, while 3%
got Rs. 45 and over per week.

Note—Let us assume the range to be Rs. 10—50. Then the
frequency distribution will be as under.


Wages          Mid-value     Percentage of workers
(m)             (m.v.)             (f)                 m.v. × f

10  —20          15                 6%                  90.00
20  —29.5        24.75             19%                 470.25
29.5—37          33.25             25%                 831.25
37  —40.5        38.75             25%                 968.75
40.5—45          42.75             22%                 940.50
45  —50          47.5               3%                 142.50

                                  100%         Σ =    3443.25

$\text{Mean} = \frac{\Sigma (m.v. \times f)}{\Sigma f} = \frac{3443.25}{100} = 34.43$

The Mean wage is Rs. 34.43.


Advantages of Median

1—Median is very easy to calculate and is readily understood,
specially in the series of individual observations and a
discrete series.

2—It is not affected by items on the extreme. It is
independent of the range of the series or the spread of values
above or below it.

3—Unlike the arithmetic average, the median may be determined
where the data are incomplete, e.g. irregular class-intervals and
open-ended final classes.

4—Median is amenable to further algebraic process ; it is
used in the calculation of mean deviation.

5—Provided the number of frequencies or items in an
ungrouped series is uneven, the Median will actually be one of
the items of the series, as it will be if a grouped distribution
contains an even number of frequencies. Otherwise the Median
is a derived figure. In contrast, the Mean seldom conforms to
any individual item.

6—Median is helpful in calculating the central value where
items are not capable of precise quantitative study, like
intelligence, honesty etc.

7—Under the circumstances, it will be appreciated that the
median is best used when the series is continuous or where a
discrete series may be treated as continuous. Where there is
a tendency for the frequencies to cluster evenly around the
middle of the series, rather than dispersing themselves
unevenly throughout with clusters around the maximum and
minimum values, the median is reliable.


Disadvantages 


1—The computation of Median requires in certain cases
arraying of the items. It is very often a cumbersome job.


2—It gives little basis for further calculations. It is not 
possible to find the total salary if median salary and the number 
of workers is given. 


3—It is subject to chance variation and thus is an unstable 
measure of central tendency. 


4—It is not possible to obtain the actual median in the
case of a group having an even number of observations. In
such a case it is a makeshift: an average of the two
items in the middle is taken.


5—A very little importance is attached to items on the 
extremes and as such the median fails to register changes due 
to the changes in the values of the items on the extremes. 


Properties 
1—It is an average of position. 


2—The sum of the deviations about the median, signs 
ignored, will be less than the sum from any other point. 
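
The second property can be checked on the data of Illustration 15, whose median is 28.5. The comparison points in this sketch are arbitrary. When the number of items is even, every point between the two middle items gives the same minimum, so "less than" is best read as "not greater than".

```python
data = [15, 23, 27, 30, 31, 31]     # arranged series of Illustration 15
median = (27 + 30) / 2              # 28.5
mean = sum(data) / len(data)


def sum_abs_dev(point):
    """Sum of deviations from a point, signs ignored."""
    return sum(abs(x - point) for x in data)


print(sum_abs_dev(median))               # deviations about the median
print(round(sum_abs_dev(mean), 2))       # deviations about the mean
print(sum_abs_dev(20), sum_abs_dev(31))  # deviations about two other points
```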


MODE 


Statements such as 'the average man prefers this brand
of cigarettes' or 'the average woman uses cosmetics' are
very often heard. Used in this context the term 'average'
means the majority and not the arithmetic mean or Median.
Mode is the value of that item of a variable which
occurs most frequently. According to Croxton and Cowden,
"The mode of a distribution is the value at the point around
which the items tend to be most heavily concentrated. It may
be regarded as the most typical of a series of values." Zizek
defines mode as "the value occurring most frequently in a series
(or group) of items and around which the other items are
distributed most densely." The word 'Mode' has been derived
from the French word 'la Mode' which means Fashion. Kenney
defines, "the value of the variable which occurs most frequently
in a distribution is called the mode."


Calculation of Mode 

Individual Series.

Illustration—28
Determine the Mode from the following figures :—
25, 15, 23, 40, 27, 25, 23, 25, 20.


In order to find out the mode we have to convert it into a
discrete series—


M.       F
15       1
20       1
23       2
25       3
27       1
40       1

Total    9


Item 25 occurs the largest number of times, hence it is the mode.
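
The conversion into a discrete series amounts to counting how often each value occurs. A minimal sketch with the figures of Illustration 28, using the standard `collections.Counter`:

```python
from collections import Counter

figures = [25, 15, 23, 40, 27, 25, 23, 25, 20]

# Frequency of each distinct value, i.e. the discrete series.
frequencies = Counter(figures)
for value in sorted(frequencies):
    print(value, frequencies[value])

# The value with the highest frequency is the mode.
mode, times = frequencies.most_common(1)[0]
print("Mode:", mode, "occurring", times, "times")
```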


Discrete Series. If there is regularity and homogeneity in
the series, the mode can be located by inspection of the series
alone. If there is doubt then it is advisable to adopt the process
of Grouping. The grouping method is illustrated in the following
example.


Illustration—29
Compute the Mode


Size of the item   2   3   4   5   6   7   8   9  10  11  12  13
Frequency          3   8  10  12  16  14  10   8  17   5   4   1


1 Croxton and Cowden—Applied General Statistics p. 189.

Size of        Frequency
item       (1)    (2)    (3)    (4)    (5)    (6)

  2         3     11            21
  3         8            18            30
  4        10     22                          38
  5        12            28     42
  6        16     30                   40
  7        14            24                   32
  8        10     18            35
  9         8            25            30
 10        17     22                          26
 11         5             9     10
 12         4      5
 13         1

In columns (2) to (6) each group total is entered against the
first item of its group.


The grouping process is done as follows—


(1) In the first column the mode is located by inspection, i.e.
the item containing the maximum frequency.

(2) In the second column we group frequencies in twos,
starting from the top. Their totals are found and the items
containing the maximum frequency located.

(3) In the third column we do as in column (2) except that
we start grouping by leaving the first frequency.

(4) In this column we group frequencies in threes,
starting from the first frequency. Totals are calculated and the
items containing the maximum frequency located.

(5) Same as in column (4), starting grouping in threes
from the second frequency.

(6) Same as in column (4), starting grouping in threes
from the third frequency.


Analysis. In order to find out the size of the item which
repeats the largest number of times we analyse the grouping.
This is done like this.

                         Size
Column     2   3   4   5   6   7   8   9  10  11  12  13

  1                                         1
  2                        1   1
  3                    1   1
  4                    1   1   1
  5                        1   1   1
  6                1   1   1

Total      0   0   1   3   5   3   1   0   1   0   0   0

Hence the Mode is located at size 6 (frequency 16).
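
The grouping and analysis tables can be reproduced mechanically. The sketch below is one possible reading of the method: the function name `grouping_analysis` is hypothetical, and where two groups in a column tie for the highest total the first is taken, which is an assumption.

```python
from collections import Counter


def grouping_analysis(sizes, freqs):
    """Count, over the six grouping columns, how often each size lies in the top group."""
    votes = Counter()
    n = len(freqs)
    # (group width, index of first frequency used) for columns (1) to (6)
    columns = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]
    for width, start in columns:
        groups = [range(i, i + width) for i in range(start, n - width + 1, width)]
        best = max(groups, key=lambda g: sum(freqs[j] for j in g))
        for j in best:
            votes[sizes[j]] += 1
    return votes


sizes = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
freqs = [3, 8, 10, 12, 16, 14, 10, 8, 17, 5, 4, 1]
votes = grouping_analysis(sizes, freqs)
print(sorted(votes.items()))
print("Mode at size", votes.most_common(1)[0][0])
```

Mere inspection would point to size 10 (frequency 17), but the grouping shows that the concentration of frequencies lies around size 6.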


Continuous Series. The computation of mode in a
continuous series presents a peculiar problem. After grouping,
the class interval having the largest frequency is found out
and then the mode is calculated by the following formula.

$Z = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2}(l_2 - l_1)$

Where—
$Z$ = Mode
$l_1$ = the lower limit of the modal class
$l_2$ = the upper limit of the modal class
$f_0$ = frequency of the class preceding the modal class
$f_1$ = frequency of the modal class
$f_2$ = frequency of the class succeeding the modal class.


Illustration—30
Find the mode from the following table.


Marks        Students
 0—10            2
10—20           18
20—30           30
30—40           45
40—50           35
50—60           20
60—70            6
70—80            3


In the above series it is quite apparent that the mode lies in
the group 30—40. Hence

$Z = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2}(l_2 - l_1) = 30 + \frac{45-30}{2 \times 45 - 30 - 35}(40 - 30) = 30 + \frac{15}{25} \times 10 = 30 + 6 = 36$
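
The formula is easily put into a small routine. The function name `mode_grouped` is hypothetical; the call uses the figures of Illustration 30.

```python
def mode_grouped(l1, l2, f0, f1, f2):
    """Mode of a continuous series interpolated within the modal class."""
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula undefined: 2*f1 equals f0 + f2")
    return l1 + (f1 - f0) / denominator * (l2 - l1)


# Modal class 30-40 with frequency 45; neighbouring classes have 30 and 35.
z = mode_grouped(30, 40, f0=30, f1=45, f2=35)
print(round(z, 2))
```

If the result falls outside the limits l1 and l2, the class chosen is not really the class of highest frequency and the mode is ill-defined for that grouping.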
Illustration—31
Compute the mode of the following distribution.

Size of item     Frequency
 4— 8               10
 8—12               12
12—16               16
16—20               14
20—24               10
24—28                8
28—32               17
32—36                5
36—40                4

Size        (1)    (2)    (3)    (4)    (5)    (6)
 4— 8       10     22            38
 8—12       12            28            42
12—16       16     30                          40
16—20       14            24     32
20—24       10     18                   35
24—28        8            25                   30
28—32       17     22            26
32—36        5             9
36—40        4

Analysis

Column    4—8   8—12   12—16   16—20   20—24   24—28   28—32
  1                                                       1
  2                      1       1
  3              1       1
  4        1     1       1
  5              1       1       1
  6                      1       1       1

Total      1     3       5       3       1       0       1

Mode lies in the group 12—16.

$Z = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2}(l_2 - l_1) = 12 + \frac{16-12}{2 \times 16 - 12 - 14}(16 - 12) = 12 + \frac{4}{6} \times 4 = 12 + 2.67 = 14.67$
Illustration—32
Find the modal wage from the following data.

Wage             No. of Earners
12.5—17.5             4
17.5—22.5            44
22.5—27.5            38
27.5—32.5            28
32.5—37.5             6
37.5—42.5             8
42.5—47.5            12
47.5—52.5             2
52.5—57.5             2

m.             (1)    (2)    (3)    (4)    (5)    (6)
12.5—17.5       4     48            86
17.5—22.5      44            82           110
22.5—27.5      38     66                          72
27.5—32.5      28            34     42
32.5—37.5       6     14                   26
37.5—42.5       8            20                   22
42.5—47.5      12     14            16
47.5—52.5       2             4
52.5—57.5       2

Analysis Table

Column    12.5—17.5   17.5—22.5   22.5—27.5   27.5—32.5   32.5—37.5
  1                       1
  2                                   1           1
  3                       1           1
  4           1           1           1
  5                       1           1           1
  6                                   1           1           1

Total         1           4           5           3           1

Mode comes in the group 22.5—27.5.

$Z = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2}(l_2 - l_1) = 22.5 + \frac{38-44}{2 \times 38 - 44 - 28}(27.5 - 22.5) = 22.5 + \frac{-6}{4} \times 5 = 22.5 - 7.5 = 15$
The mode according to the above grouping must fall in the
class interval 22.5—27.5. It has shifted to a lower class.
Hence the mode is ill-defined. We should regroup the sizes in
twos as follows in order to find out the mode.


M                F
12.5—22.5       48
22.5—32.5       66
32.5—42.5       14
42.5—52.5       14
52.5—62.5        2


Thus the mode lies in the group 22.5—32.5.

$Z = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2}(l_2 - l_1) = 22.5 + \frac{66-48}{2 \times 66 - 48 - 14}(32.5 - 22.5) = 22.5 + \frac{18}{70} \times 10 = 22.5 + 2.57 = 25.07$


Graphic location of Mode. Mode can also be located
graphically. Mode is found in a frequency polygon or a
frequency curve at the position of the apex. From that
position a perpendicular is drawn on the x-axis and the value
of the mode is found.


Illustration—33


Determine the value of the mode of the following distribution
graphically and verify the result—


Marks        Students
 0— 5
 5—10
10—15
15—20
20—25           57
25—30           75
30—35           48
35—40
40—45
45—50

$Z = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2}(l_2 - l_1) = 25 + \frac{75-57}{150 - 57 - 48}(30 - 25) = 25 + \frac{18}{45} \times 5 = 27$


[Figure: Determination of Mode. Vertical axis: No. of students; horizontal axis: Marks (0 to 50).]

Advantages of Mode
Advantages of Mode 


1—This is an average having more practical use as compared 
to Mean or Median. It has greater value in scientific and 
commercial matters. 

2—The Mode has the great advantage that it is usually
an actual value of an important part of the series, though not
necessarily the major part. This assumes that the modal value is
apparent from simple observation of the distribution, i.e. there
is an obvious concentration of frequencies around a certain value.
If the mode has been interpolated by formula, or the mid-point
of a large class interval used, then this statement does not hold.

3—Like the Median, mode is unaffected by the dispersion 
of the series i.e. its distribution over the range. 

4—It is not affected by extreme items. It can be calculated 
even if extremes are not known. 

5—It is simple and precise. 

6—Very often it can be ascertained by mere inspection and 
is known as an inspection average. 


Disadvantages 
1—It does not lend itself to further mathematical treatment.


2—Unless the number of frequencies is reasonably large
and the distribution reveals a marked tendency to group around
a given value, the mode is not easy to determine.


3—It is unsuitable in cases where the relative importance of
items has to be considered.


4—Choice of grouping has considerable influence on the 
value of the Mode. 


GEOMETRIC MEAN 


The geometric mean is derived by multiplying together all
the values and then extracting the relevant root of the product
of these values. The root to be extracted is the number of
items in the series. Thus the geometric mean is the nth root of
the product of n numbers. The formula in natural numbers is

$G.M. = \sqrt[n]{n_1 \times n_2 \times n_3 \times \dots \times n_n}$

In order to facilitate calculation, logarithms are used in
its calculation. Then the formula becomes :—

$G.M. = \text{Antilog of } \frac{\text{Log } n_1 + \text{Log } n_2 + \text{Log } n_3 + \dots + \text{Log } n_n}{N}$


Individual Series 


Illustration—34


Calculate the G.M. of the following series—
20, 58, 87, 130, 170, 250.

Items        Logs
 20         1.3010
 58         1.7634
 87         1.9395
130         2.1139
170         2.2304
250         2.3979

Total      11.7461

$G.M. = \text{Antilog of } \frac{11.7461}{6} = \text{Antilog of } 1.95768 = 90.7$
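
The logarithmic method can be sketched as below, with `math.log10` taking the place of the printed log tables. The function name `geometric_mean_by_logs` is hypothetical; the data are those of Illustration 34.

```python
import math


def geometric_mean_by_logs(values):
    """Geometric mean as the antilog of the mean of the common logarithms."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("all items must be positive")
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log          # taking the antilog


items = [20, 58, 87, 130, 170, 250]
print(round(sum(math.log10(v) for v in items), 4))   # total of the logs
print(round(geometric_mean_by_logs(items), 1))       # geometric mean
```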
Illustration—35
Calculate the G.M. of the following two series :—

Series A      Series B
3884          0.9842
382           0.3154
63            0.0252
8             0.0068
0.4           0.0200
0.03          0.0002
0.009         0.5444
0.0005        0.4010

Series A    Logarithms A        Series B    Logarithms B
3884        3.5888              0.9842      $\bar{1}.9930$
382         2.5821              0.3154      $\bar{1}.4983$
63          1.7993              0.0252      $\bar{2}.4014$
8           0.9031              0.0068      $\bar{3}.8325$
0.4         $\bar{1}.6021$      0.0200      $\bar{2}.3010$
0.03        $\bar{2}.4771$      0.0002      $\bar{4}.3010$
0.009       $\bar{3}.9542$      0.5444      $\bar{1}.7356$
0.0005      $\bar{4}.6990$      0.4010      $\bar{1}.6031$

N = 8       Σ = 1.6057          N = 8       Σ = $\overline{11}.6659$

Geometric Mean

Series A:
$G.M. = \text{Antilog of } \frac{\Sigma \text{Log}}{N} = \text{Antilog of } \frac{1.6057}{8} = \text{Antilog of } 0.2007 = 1.589$ units

Series B:
$G.M. = \text{Antilog of } \frac{\Sigma \text{Log}}{N} = \text{Antilog of } \frac{\overline{11}.6659}{8} = \text{Antilog of } \bar{2}.7082 = 0.05105$ units

Illustration—36


The monthly incomes of 10 families in rupees in a certain
locality are given below. Calculate the Geometric Mean :—


85, 70, 15, 75, 500, 8, 45, 250, 40 and 36.
                              (B. Com. Agra)


Family      Income      Logarithms
  1           85          1.9294
  2           70          1.8451
  3           15          1.1761
  4           75          1.8751
  5          500          2.6990
  6            8          0.9031
  7           45          1.6532
  8          250          2.3979
  9           40          1.6021
 10           36          1.5563

N = 10               Σ = 17.6373

$G.M. = \sqrt[n]{n_1 \times n_2 \times \dots \times n_n} = \sqrt[10]{85 \times 70 \times 15 \times 75 \times 500 \times 8 \times 45 \times 250 \times 40 \times 36}$

or in logs, $G.M. = \text{A.L. of } \frac{\Sigma \text{Log}}{N} = \text{A.L. of } \frac{17.6373}{10} = \text{A.L. of } 1.7637 = \text{Rs. } 58.03$


Discrete Series. While calculating the G.M. of a discrete series,
logarithms of the different values should be found out and they
should be multiplied by their respective frequencies. Their
total should be divided by the total number of frequencies. The
formula is

$G.M. = \text{Antilog of } \frac{\text{Log } x_1 \times f_1 + \text{Log } x_2 \times f_2 + \text{Log } x_3 \times f_3 + \dots + \text{Log } x_n \times f_n}{f_1 + f_2 + f_3 + \dots + f_n}$

$= \text{Antilog of } \frac{\Sigma(\text{Log } x \times f)}{\Sigma f}$

Illustration—37
From the following data calculate the G.M.

Size of item      Frequency
  6                   8
  7                  12
  8                  18
  9                  26
 10                  16
 11                  12
 12                   8
Total               100

M      Logarithms      F      Log × f

 6       0.7782         8       6.2256
 7       0.8451        12      10.1412
 8       0.9031        18      16.2558
 9       0.9542        26      24.8092
10       1.0000        16      16.0000
11       1.0414        12      12.4968
12       1.0792         8       8.6336

Total                 100      94.5622

$G.M. = \text{Antilog of } \frac{\Sigma(\text{Log } x \times f)}{\Sigma f} = \text{Antilog of } \frac{94.5622}{100} = \text{Antilog of } 0.9456 = 8.822$
Illustration—38


The following table gives the marks obtained by 65 students
in Statistics in M.A. (Econ.) examination :—

Marks               No. of students
More than 70               7
    "     60              18
    "     50              40
    "     40              40
    "     30              63
    "     20              65


Calculate the G.M. of the above series.

Marks       M.V.      F      Log M.V.     Log M.V. × F
20— 30       25       2       1.3979         2.7958
30— 40       35      23       1.5441        35.5143
40— 50       45       0       1.6532         0.0000
50— 60       55      22       1.7404        38.2888
60— 70       65      11       1.8129        19.9419
70—100       75       7       1.8751        13.1257

Total                65                    109.6665

$G.M. = \text{Antilog of } \frac{\Sigma(\text{Log } x \times f)}{\Sigma f} = \text{Antilog of } \frac{109.6665}{65} = \text{Antilog of } 1.6872 = 48.64$ marks.


Weighted Geometric Mean 


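In order to calculate the weighted geometric mean the formula
is modified as given below :—

$G.M._w = \text{Antilog of } \frac{\text{Log } x_1 \times w_1 + \text{Log } x_2 \times w_2 + \text{Log } x_3 \times w_3 + \dots + \text{Log } x_n \times w_n}{w_1 + w_2 + w_3 + \dots + w_n}$

$= \text{Antilog of } \frac{\Sigma(\text{Log } x \times w)}{\Sigma w}$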
Illustration—39


Calculate the G.M. of the following weighted frequency
distribution—


Indices      Weights
110             4
125             1
 92             3
100            10
160             5
 84             8

How does it differ from the unweighted geometric mean ?


Indices     Weights     Logs of Indices     Log × w
110            4            2.0414            8.1656
125            1            2.0969            2.0969
 92            3            1.9638            5.8914
100           10            2.0000           20.0000
160            5            2.2041           11.0205
 84            8            1.9243           15.3944

Total         31           12.2305           62.5688

$G.M._w = \text{Antilog of } \frac{\Sigma(\text{Log } x \times w)}{\Sigma w} = \text{Antilog of } \frac{62.5688}{31} = \text{Antilog of } 2.0183 = 104.3$ units

$G.M. = \text{Antilog of } \frac{\Sigma \text{Log } x}{N} = \text{Antilog of } \frac{12.2305}{6} = \text{Antilog of } 2.0384 = 109.2$ units
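
The weighted formula, which also serves for a discrete series when frequencies are used as weights, can be sketched as follows. The function name `weighted_geometric_mean` is hypothetical; the data are those of Illustration 39, and `math.log10` replaces the log tables, so the last digit may differ slightly from table values.

```python
import math


def weighted_geometric_mean(values, weights):
    """Antilog of the weighted mean of the common logarithms of the values."""
    if any(v <= 0 for v in values):
        raise ValueError("all items must be positive")
    total_weight = sum(weights)
    weighted_logs = sum(w * math.log10(v) for v, w in zip(values, weights))
    return 10 ** (weighted_logs / total_weight)


indices = [110, 125, 92, 100, 160, 84]
weights = [4, 1, 3, 10, 5, 8]

weighted = weighted_geometric_mean(indices, weights)
unweighted = weighted_geometric_mean(indices, [1] * len(indices))
print(round(weighted, 1), round(unweighted, 1))
```

The weighted figure is the lower of the two because the heaviest weights (10 and 8) fall on the low indices 100 and 84.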
Geometric mean provides а satisfactory measure of 
computing rate of growth of economic phenomena. Rate of 
growth can be calculated as shown in the following example :— 


Illustration—39A 
Profits of a firm for different years are given below. Find
out the rate of growth :—

Year    Profits in lakh Rs.    Ratio to previous year     Log
1950          88.2
1951         138.9             138.9 / 88.2 = 1.57      0.1959
1952         174.2                            1.25      0.0969
1953         201.5                            1.16      0.0645
1954         189.5                            0.94      1̄.9731 (= -0.0269)
1955         218.6                            1.15      0.0607
1956         233.4                            1.07      0.0294
1957         254.0                            1.09      0.0374
Total                                         8.23      0.4579

a = 8.23 / 7 = 1.176        G.M. = Antilog of 0.4579 / 7 = 1.163

If the above data are shown according to these two rates of
growth, the figures would be

Growth Estimates

Year    Actual    According to       According to
                  arithmetic rate    geometric rate
1950     88.2          88.2               88.2
1951    138.9         103.7              102.6
1952    174.2         122.6              119.2
1953    201.5         143.5              138.6
1954    189.5         168.8              161.2
1955    218.6         198.5              187.5
1956    233.4         233.4              218.1
1957    254.0         274.5              253.6

Mean of rates of growth   1.176              1.163
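The point of the table of growth estimates is that only the geometric rate, when compounded, brings the first year's figure to the last year's figure. The sketch below is an illustration under our own naming (`project` is a hypothetical helper, not from the text); it uses unrounded ratios, so its results differ slightly from the table, which works with ratios rounded to two places.

```python
import math

profits = [88.2, 138.9, 174.2, 201.5, 189.5, 218.6, 233.4, 254.0]  # 1950 to 1957

# Ratio of each year's profit to the previous year's profit
ratios = [later / earlier for earlier, later in zip(profits, profits[1:])]

arithmetic_rate = sum(ratios) / len(ratios)               # about 1.176
geometric_rate = math.prod(ratios) ** (1 / len(ratios))   # about 1.163

def project(start, rate, periods):
    """Compound the starting figure at a constant rate per period."""
    return start * rate ** periods

end_by_arithmetic = project(profits[0], arithmetic_rate, len(ratios))
end_by_geometric = project(profits[0], geometric_rate, len(ratios))
print(round(arithmetic_rate, 3), round(geometric_rate, 3))
```

Compounding at the arithmetic rate overshoots the actual 1957 profit, while compounding at the geometric rate reproduces it.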


Advantages of Geometric Mean

1—Geometric mean is the most appropriate average for
measuring the ratios of change.

2—It is an average most suitable when large weights have to
be given to small items and small weights to large items, which
we usually come across in the study of social and economic
phenomena.

3—As it is less affected by extremes, it is a more typical
average than the arithmetic average.

4—It is capable of further algebraic treatment.

Disadvantages

1—Geometric mean cannot be calculated if the size of any
item of the variable is either '0' or negative. In such a case
G.M. will be either '0' or will lead to a figure difficult to interpret.

2—Its computation is rather difficult as it requires the
knowledge of logarithms. Hence it is less popular.

3—In normal and moderately asymmetrical distributions,
the G.M. cannot typify the frequency apart from the difficulty
of locating it.


Properties

1—G.M. is less than the arithmetic average of the same
observations, unless all the items are equal, in which case the
two coincide.

2—It has the property that taking its nth power (that is
multiplying together n numbers all equal to the G.M.) gives the
same result as multiplying together all of the original
observations.

HARMONIC MEAN


The harmonic mean is a type of average capable of application
only within a restricted field. The harmonic mean of a series
of numbers is the reciprocal of the arithmetic mean of the
reciprocals of the individual members. The computation of the
harmonic mean of a series is greatly facilitated by the use of a
prepared table of reciprocals. The harmonic mean is less than
the geometric mean of the same observations. It is useful when
the observations are expressed inversely to what is required in
the average, for example, when the average hours per mile is
required but the data show miles per hour. The formula for
computing the harmonic mean is :—


H.M. = N / (1/a + 1/b + 1/c + ... + 1/n)

Or

H.M. = Reciprocal of (1/a + 1/b + 1/c + ... + 1/n) / N

Individual Series 


Illustration —40 


The monthly incomes of ten families in rupees in a certain 
locality are given below. Calculate the Harmonic Mean :— 


85, 70, 10, 75, 500, 8, 42, 250, 40 and 36. 


(B. Com. Agra)

Family    Income in Rs.    Reciprocals
   1           85            0.01176
   2           70            0.01429
   3           10            0.10000
   4           75            0.01333
   5          500            0.00200
   6            8            0.12500
   7           42            0.02381
   8          250            0.00400
   9           40            0.02500
  10           36            0.02778

N = 10                   Σ   0.34697

H.M. = Reciprocal of (1/a + 1/b + 1/c + ... + 1/n) / N

     = Reciprocal of
       (1/85 + 1/70 + 1/10 + 1/75 + 1/500 + 1/8 + 1/42 + 1/250 + 1/40 + 1/36) / 10

     = Reciprocal of 0.34697 / 10
     = Reciprocal of .034697
     = Rs. 28.82

By the Second Formula—

H.M. = N / (1/a + 1/b + 1/c + ... + 1/n)
     = 10 / 0.34697
     = Rs. 28.82
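The working of Illustration 40 can be reproduced in a few lines. This is a minimal sketch of the second formula, with a function name of our own choosing; Python's standard library also provides `statistics.harmonic_mean`, which is used here only as a cross-check.

```python
import statistics

def harmonic_mean(values):
    """H.M. = N divided by the sum of the reciprocals of the items."""
    return len(values) / sum(1 / v for v in values)

# Monthly incomes of the ten families in Illustration 40
incomes = [85, 70, 10, 75, 500, 8, 42, 250, 40, 36]

hm = harmonic_mean(incomes)                    # close to Rs. 28.82
library_hm = statistics.harmonic_mean(incomes)
arithmetic = sum(incomes) / len(incomes)       # Rs. 111.6, pulled up by 500 and 250
print(round(hm, 2), round(arithmetic, 1))
```

The two small incomes (Rs. 8 and Rs. 10) dominate the sum of reciprocals, which is why the harmonic mean lies so far below the arithmetic mean.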


Illustration —41 


An aeroplane flies around a square whose side is 100 miles
long, taking the first side @ 100 miles per hour, the second side
@ 200 miles per hour, the third side @ 300 miles per hour and
the fourth side @ 400 miles per hour. What is the average
speed of the aeroplane ? Test the validity of your answer.


H.M. = Reciprocal of (1/a + 1/b + 1/c + 1/d) / N
     = Reciprocal of (1/100 + 1/200 + 1/300 + 1/400) / 4
     = 4 / (.01 + .005 + .0033 + .0025)
     = 4 / .0208
     = 1 / .0052
     = 192 m.p.h.

Instead of taking the H.M., if we calculate the arithmetic mean it
would give a wrong conclusion in such a case.

a = (100 + 200 + 300 + 400) / 4 = 1000 / 4 = 250 m.p.h.

This average gives a wrong idea, as can be proved by taking
the above example.

The aeroplane took 1 hour in travelling the 1st side of 100 miles,
                   30 minutes in travelling the 2nd side,
                   20 minutes in travelling the 3rd side, and
                   15 minutes in travelling the 4th side.

Total time taken : 2 hrs. and 5 minutes for 400 miles.

2 5/60 = 25/12 hrs. are taken for flying 400 miles.

In 1 hr. the distance flown = (400 × 12) / 25
                            = 192 m.p.h.

Hence H.M. is the suitable average in this case.
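The test of validity used in Illustration 41 amounts to dividing the total distance by the total time. A short sketch (variable names are ours) shows that this agrees with the harmonic mean and not with the arithmetic mean when equal distances are covered at each speed.

```python
side_miles = 100
speeds = [100, 200, 300, 400]          # miles per hour on the four sides

# Time spent on each side, in hours: 1, 1/2, 1/3 and 1/4
times = [side_miles / s for s in speeds]

true_average_speed = side_miles * len(speeds) / sum(times)
harmonic = len(speeds) / sum(1 / s for s in speeds)
arithmetic = sum(speeds) / len(speeds)

print(round(true_average_speed, 2), round(harmonic, 2), arithmetic)
```

The agreement holds because each speed applies to the same distance; if the sides were of different lengths, the distances would have to be used as weights.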

Discrete Series. In a discrete series the reciprocals of the
different sizes are taken out and multiplied by the
frequencies of the respective sizes. The total of the
reciprocals multiplied by their frequencies is divided by the
total number of frequencies, and the reciprocal of the quotient
is the harmonic mean.


Illustration—42
Calculate the Harmonic mean from the following data :—

Size    Frequency
  3        20
  5        40
  7        30
  9        10
Total     100

Size    Frequency    Reciprocals of sizes    Rec. × f
  3        20              .3333              6.6666
  5        40              .2000              8.0000
  7        30              .1429              4.2870
  9        10              .1111              1.1111
Total     100                                20.0647

H.M. = Reciprocal of 20.0647 / 100
     = Reciprocal of .200647
     = 4.984

Continuous Series
Illustration—43


Find the Harmonic Mean for the following distribution :—

Class     Frequency     Class      Frequency
40—50        19         70— 80        72
50—60        25         80— 90        51
60—70        36         90—100        43

Class      M.V.     f     Rec. of M.V.    Rec. × f
40— 50      45     19        .0222          .4218
50— 60      55     25        .0182          .4550
60— 70      65     36        .0154          .5544
70— 80      75     72        .0133          .9576
80— 90      85     51        .0118          .6018
90—100      95     43        .0105          .4515
Total             246                      3.4421

H.M. = Reciprocal of 3.4421 / 246

     = 71.4
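For grouped data the procedure of Illustrations 42 and 43 reduces to dividing the total frequency by the sum of frequency over size. The sketch below is illustrative only; the function name is ours, and the frequencies of Illustration 43 are taken as 36 and 43 for the third and last classes, which is what the Rec. × f column and the total of 246 imply.

```python
def grouped_harmonic_mean(sizes, frequencies):
    """H.M. = N / sum(f / x), where N is the total frequency."""
    total_frequency = sum(frequencies)
    reciprocal_total = sum(f / x for x, f in zip(sizes, frequencies))
    return total_frequency / reciprocal_total

# Illustration 42 (discrete series)
hm_discrete = grouped_harmonic_mean([3, 5, 7, 9], [20, 40, 30, 10])

# Illustration 43 (continuous series, using the mid-values of the classes)
mid_values = [45, 55, 65, 75, 85, 95]
hm_continuous = grouped_harmonic_mean(mid_values, [19, 25, 36, 72, 51, 43])

print(round(hm_discrete, 3), round(hm_continuous, 1))
```

Small differences from hand working arise because the tables use reciprocals rounded to four places.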


Advantages of Harmonic Mean 


1—Harmonic Mean is calculated after taking into account 
all the items of the series. 

2—It gives less weight to large items and more weight to 
small items. 

3—This average is useful in the case of a series having 
wide dispersion. 


Disadvantages 

(1) It is difficult to compute and is not understandable to
the common man.

(2) Harmonic mean can be computed only when all the
items of the series are known.

There are a number of other measures of central tendency 
which are of mathematical and theoretical rather than of 
practical interest. They are discussed below :— 


QUADRATIC MEAN 


If each observation is squared, the arithmetic mean of the
squares is computed and the square root of this mean is taken,
the result is the quadratic mean. The quadratic mean is larger
than the arithmetic mean of the same observations. The
formula for calculating it is :—


Q.M. = √( (a² + b² + c² + ... + n²) / N )


Illustration—44 


Find out the Quadratic Mean of the following prices of
some commodities given in rupees.


Commodity    Price per md. in rupees
Wheat                16
Rice                 20
Sugar                40
Potato               10
Oil                  75

Commodity    Price per md.    Square of prices
Wheat             16                256
Rice              20                400
Sugar             40               1600
Potato            10                100
Oil               75               5625
N = 5                       Total  7981

Q.M. = √(7981 / 5)
     = √1596.2
     = Rs. 39.9 approx.
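The quadratic mean of Illustration 44 can be verified as follows. This is a small sketch with a function name of our own, and it also confirms the statement that the quadratic mean exceeds the arithmetic mean of the same observations.

```python
import math

def quadratic_mean(values):
    """Square root of the arithmetic mean of the squares."""
    return math.sqrt(sum(v * v for v in values) / len(values))

prices = [16, 20, 40, 10, 75]     # wheat, rice, sugar, potato, oil

qm = quadratic_mean(prices)                 # square root of 1596.2
arithmetic = sum(prices) / len(prices)      # 32.2
print(round(qm, 2), arithmetic)
```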
MOVING AVERAGES


A moving average is a form of arithmetic average computed to
obtain a new series by dropping the earliest item and taking
in its place the next succeeding item. It may be calculated on a
3 yearly, 4 yearly, 5 yearly, 7 yearly etc. basis. Symbolically


M.A. (3 yearly) = (a+b+c)/3, (b+c+d)/3, (c+d+e)/3, (d+e+f)/3 ...

M.A. (4 yearly) = (a+b+c+d)/4, (b+c+d+e)/4, (c+d+e+f)/4 ...

M.A. (5 yearly) = (a+b+c+d+e)/5, (b+c+d+e+f)/5, (c+d+e+f+g)/5 ...


Illustration—45

Find 3, 4, 5 and 7 yearly moving averages from the following data.

[Table printed sideways in the original, with columns for S. No., items, and the totals and averages for each period; its figures are not legible in this copy.]

If the period of the moving average is odd there is no difficulty in
placing the average. In the case of a 3 yearly moving average,
the averages will be placed against the 2nd, 3rd, 4th and so on
items. In the case of a 5 yearly moving average the averages
will be placed against the 3rd, 4th, 5th and so on items. If the
period is even, say 4 or 6 years, the averages fall between two
items and are therefore to be centred. This
is illustrated in the examples at page 224.
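The placing and centring rules can be made concrete with a short sketch. The figures below are hypothetical, and both function names are our own; centring is done here by averaging each pair of adjacent moving averages, which is one common way of bringing an even-period average against an actual item.

```python
def moving_averages(items, period):
    """Averages of each run of `period` consecutive items."""
    return [sum(items[i:i + period]) / period
            for i in range(len(items) - period + 1)]

def centred_moving_averages(items, period):
    """For an even period: average each pair of adjacent moving
    averages so that the result falls against an actual item."""
    plain = moving_averages(items, period)
    return [(first + second) / 2 for first, second in zip(plain, plain[1:])]

sales = [10, 12, 14, 16, 18, 20]     # hypothetical yearly figures

three_yearly = moving_averages(sales, 3)         # against items 2, 3, 4 and 5
four_yearly = moving_averages(sales, 4)          # fall between items, not centred
four_centred = centred_moving_averages(sales, 4) # against items 3 and 4
print(three_yearly, four_yearly, four_centred)
```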


PROGRESSIVE AVERAGE 


А progressive average is a cumulative average used 
occasionally during the early years of the life of a business. 
This is computed by taking all the figures available in each 
succeeding year. 


Thus the averages for the different years will be :—


(a+b)/2, (a+b+c)/3, (a+b+c+d)/4, (a+b+c+d+e)/5 ...


where a, b, c, d, e etc. represent the different items in a series.
Illustration —46 


Calculate progressive average from the following data. 


Year    Profits    Progressive    Items included    Average
                   totals

1951    37,000
1952    39,000       76,000             2            38,000
1953    44,000      120,000             3            40,000
1954    48,000      168,000             4            42,000
1955    42,000      210,000             5            42,000
1956    36,000      246,000             6            41,000
1957    48,000      294,000             7            42,000
1958    50,000      344,000             8            43,000
1959    47,000      391,000             9            43,444
1960    37,000      428,000            10            42,800
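A progressive average is simply a running total divided by the number of items included so far. The following sketch (the function name is ours) reproduces the first rows of Illustration 46, using the profits of 1951 to 1956.

```python
from itertools import accumulate

def progressive_averages(items):
    """Cumulative average after the 2nd, 3rd, 4th ... item."""
    totals = list(accumulate(items))                 # progressive totals
    averages = [total / count
                for count, total in enumerate(totals, start=1)]
    return averages[1:]                              # the first year has no average

profits = [37000, 39000, 44000, 48000, 42000, 36000]   # 1951 to 1956
print(progressive_averages(profits))
```

Each new year changes the average less than the year before, because its figure is spread over a growing number of items.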


Composite Average. A composite average is an arithmetic
average computed by taking out an average of various averages.
It is calculated by the following formula.


Composite average = (a1 + a2 + a3 + ... + an) / No. of series


The following illustration will make its calculation clear :—

Illustration—47


Given below are the monthly sales in thousands of rupees
of The Gwalior Stores Ltd. for the years 1957, 1958 and 1959.
Calculate the composite average :—


          1957    1958    1959
Jan.       15      36      40
Feb.       18      48      35
March      24      60      50
April      30      84      70
May        45      68      90
June       54      60      55
July       30      44      40
Aug.       27      40      25
Sept.      21      40      20
Oct.       12      32      45
Nov.        6      32      45
Dec.        6      32      25
Total     288     576     540

a1 = Σx / N = 288 / 12 = 24
a2 = Σx / N = 576 / 12 = 48
a3 = Σx / N = 540 / 12 = 45

Composite average = (a1 + a2 + a3) / n
                  = (24 + 48 + 45) / 3
                  = 117 / 3
                  = 39
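The calculation of Illustration 47 can be sketched as below (names are ours). Because every year has the same number of months, the composite average here equals the plain average of all thirty-six monthly figures; that would not hold if the series were of unequal length.

```python
monthly_sales = {
    1957: [15, 18, 24, 30, 45, 54, 30, 27, 21, 12, 6, 6],
    1958: [36, 48, 60, 84, 68, 60, 44, 40, 40, 32, 32, 32],
    1959: [40, 35, 50, 70, 90, 55, 40, 25, 20, 45, 45, 25],
}

# Average of each year's series, then the average of those averages
yearly_averages = {year: sum(sales) / len(sales)
                   for year, sales in monthly_sales.items()}
composite_average = sum(yearly_averages.values()) / len(yearly_averages)

all_months = [s for sales in monthly_sales.values() for s in sales]
overall_average = sum(all_months) / len(all_months)
print(yearly_averages, composite_average)
```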


Relations between different Averages. A study of the
relationship between different averages can be suitably made if
averages are classified into two groups (1) Mean, Median and
Mode and (2) Mean, Geometric mean and Harmonic mean.


(1) Mean, Median and Mode—If we interpret these three
measures in terms of the smooth frequency curve which we
should get if we had a very large number of observations, we
can say that the mean is the value of the variable which is the
point of balance or centre of gravity of the distribution, the
median is the value which divides the distribution exactly in
half, and the mode is the value at which the peak of the
distribution occurs. In a symmetrical distribution all three
measures coincide, i.e. a = M = Z, but skewness pushes the
measures apart.


In a moderately asymmetrical distribution

Mode = 3 Median - 2 Mean
or symbolically   Z = 3M - 2a

and   2a = 3M - Z
and   3M = 2a + Z

or    (M - Z) = 2/3 (a - Z)
and   a - Z = 3 (a - M)


Illustration—48 
In a moderately asymmetrical distribution determine the
value of the following :—


(i) Mode if Median = 27.5 and the Mean = 30.5
(ii) Median if Mode = 15.4 and the Mean = 18.6
(iii) Mean if Mode = 50.2 and the Median = 40


(i) By the equation (M - Z) = 2/3 (a - Z)

       (27.5 - Z) = 2/3 (30.5 - Z)
    or (27.5 - Z) = 61/3 - 2Z/3
    or 82.5 - 3Z = 61 - 2Z
       Z = 21.5

    This is also the value of Z by the equation
       a - Z = 3 (a - M)
       30.5 - Z = 3 (30.5 - 27.5)
       30.5 - Z = 9
       -Z = 9 - 30.5
       Z = 21.5

(ii)   (M - 15.4) = 2/3 (18.6 - 15.4)
       M - 15.4 = 2.13
       M = 2.13 + 15.4
         = 17.53

    Or, by the equation a - Z = 3 (a - M)
       18.6 - 15.4 = 3 (18.6 - M)
       18.6 - 15.4 = 55.8 - 3M
       3M = 55.8 - 18.6 + 15.4
       3M = 52.6
       M = 17.53

(iii)  (40 - 50.2) = 2/3 (a - 50.2)
       40 - 50.2 = 2a/3 - 100.4/3
       120 - 150.6 = 2a - 100.4
       -2a = -100.4 - 120 + 150.6
       a = 34.9

    Or, by the equation a - Z = 3 (a - M)
       a - 50.2 = 3 (a - 40)
       a - 50.2 = 3a - 120
       a - 3a = -120 + 50.2
       -2a = -69.8
       a = 34.9
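All three parts of Illustration 48 are rearrangements of the single relation Z = 3M - 2a. A brief sketch, with function names of our own choosing, makes this explicit; it applies only to moderately asymmetrical distributions, for which the relation is an approximation.

```python
def mode_from(mean, median):
    """Z = 3M - 2a"""
    return 3 * median - 2 * mean

def median_from(mean, mode):
    """M = (2a + Z) / 3"""
    return (2 * mean + mode) / 3

def mean_from(median, mode):
    """a = (3M - Z) / 2"""
    return (3 * median - mode) / 2

z = mode_from(mean=30.5, median=27.5)       # part (i)
m = median_from(mean=18.6, mode=15.4)       # part (ii)
a = mean_from(median=40, mode=50.2)         # part (iii)
print(round(z, 2), round(m, 2), round(a, 2))
```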


(2) Mean, Geometric Mean and Harmonic Mean—If all the
items in a variable are the same, the arithmetic mean, the
geometric mean and the harmonic mean are equal.


Symbolically a = G.M. = H.M.


But if the sizes vary, as will generally be the case, the
arithmetic mean will be greater than the geometric mean, which
will be greater than the harmonic mean. This is because of
the property of the geometric mean to give larger weight to
smaller items and of the harmonic mean to give the largest weight
to the smallest item. Hence a ≥ G.M. ≥ H.M.


Illustration—49 
Prove that
a ≥ G.M. ≥ H.M.
(B. Com. B.H.U.)
We have to prove that the arithmetic average, geometric mean
and harmonic mean are equal in some series, and that in other
series the arithmetic mean is greater than the geometric mean, which
is greater than the harmonic mean.


(i) If the different items in a series are the same, the arithmetic
mean, geometric mean and harmonic mean will be equal.

Suppose there are two items x and y in a series and both
are of equal value, say '3'. Then


a = (3 + 3) / 2 = 3
G.M. = √(3 × 3) = 3
H.M. = Reciprocal of (1/3 + 1/3) / 2 = 3

(ii) If x and y are positive and not of the same size then

a = (x + y) / 2,   G.M. = √(xy),   H.M. = 2 / (1/x + 1/y) or 2xy / (x + y)

As these two values are not of equal size, it is definite
that

   (√x - √y)² > 0

or x - 2√(xy) + y > 0

or x + y - 2√(xy) > 0

or x + y > 2√(xy)

If we divide both the sides by 2,

then (x + y) / 2 > √(xy)

or a > G.M.

If both sides of x + y > 2√(xy) are multiplied by √(xy) / (x + y), then

   √(xy) > 2xy / (x + y)

or G.M. > H.M.
Q.E.D.
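The two cases of the proof can be checked numerically. The sketch below is only an illustration for two positive items (the function name and the sample values 4 and 9 are ours); it also shows that for two items the geometric mean is itself the geometric mean of the arithmetic and harmonic means.

```python
import math

def three_means(x, y):
    """Arithmetic, geometric and harmonic means of two positive items."""
    arithmetic = (x + y) / 2
    geometric = math.sqrt(x * y)
    harmonic = 2 * x * y / (x + y)
    return arithmetic, geometric, harmonic

equal_case = three_means(3, 3)       # case (i): all three coincide
a, g, h = three_means(4, 9)          # case (ii): unequal items
print(equal_case, (a, g, round(h, 3)))
```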


Uses of the different Averages. The choice of а particular 
average is usually determined by the purpose of investigation. 
Within the framework of descriptive statistics, the main
requirement is to know what each average means and then select 
the one that fulfils the purpose at hand. The nature of 
distribution also determines the type of average to be used. 
For example if the distribution is symmetrical or approximately 
so, then a or M or Z may be used almost interchangeably. 

No single average can indicate all the features of the
central tendency of a series. Hence it is necessary that before
deciding the type of average to be used, a study should be made
of the nature, constitution, distribution of frequencies etc. of
that series. Besides these, there are other considerations also,
usually of secondary importance, in selecting an average.

(1) In certain commonly encountered applications, the mean 
is subject to less sampling variability than the median or Mode. 

(2) Given only the original observations, the median is
sometimes easiest to calculate. Sometimes when there is no
strong advantage for the mean, this advantage is enough to
indicate the use of the median.

(3) Once a frequency distribution has been formed, the
mode and the median are more quickly calculated than the mean.
Moreover, when some classes are open ended the mean cannot
be calculated from the frequency distribution.


(4) The median is not a good measure when there are very 
few possible values for the observations as with number of 
children or size of family. 


(5) The mode and the median are relatively little affected 
by ‘extreme’ observations. 


(6) Calculation of the geometric mean and the harmonic mean is
difficult as it involves the knowledge of logarithms and
reciprocals.


Taking the above points into consideration let us examine
the uses of different averages :—


(1) Arithmetic Average :—The arithmetic average is used 
in the study of a social, economic or commercial problem like 
production, income, price, imports, exports etc. The central 
tendency of these phenomena can best be studied by taking out 
an arithmetic average. Whenever we talk of an ‘average 
income’ or ‘average production’ or ‘average price’ we always 
mean arithmetic average of all these things. Whenever there 
is no indication about the type of the average to be used, 
arithmetic average is computed. 


(2) Weighted Arithmetic Average :—When it is desirable 
to give relative importance to the different items of a series, 
weighted arithmetic average is computed. If it is desired to 
compute per capita consumption of a family due weights should 
be assigned to children, males, females. This average is also 
useful in constructing index numbers. The weighted arithmetic 
average should be used in the following cases :— 


(a) If it is desired to have an average of a whole group 
which is divided into а number of sub-classes, widely divergent 
from each other. 

(b) When items falling in various sub-classes change in 
such a way that the proportion which the items bear among 
themselves also undergoes a change. 

(c) When combined average has to be computed. 


(d) When it is desired to find an average of
ratios, percentages or rates.
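Case (c), the combined average, is the simplest instance of weighting: each group's mean is weighted by the number of items in the group. The sketch below borrows the figures of Practical Question 5(b) at the end of this chapter (20 students with a mean of 66% and 15 students with a mean of 70%); the function name is our own.

```python
def weighted_mean(values, weights):
    """Sum of value × weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

class_means = [66, 70]      # mean grade of each class, per cent
class_sizes = [20, 15]      # number of students in each class

combined_mean = weighted_mean(class_means, class_sizes)
simple_mean = sum(class_means) / len(class_means)
print(round(combined_mean, 1), simple_mean)
```

The unweighted figure of 68 overstates the combined mean because it treats the smaller class as if it were as large as the bigger one.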
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Median. Median is specially applicable to cases which are 
not capable of precise quantitative studies such as intelligence, 
honesty, etc. It is less applicable in economic or business 
statistics, because there is lack of stability in such data. 


Mode. The utility of the mode is being appreciated more and
more day by day. In the sciences of Biology and Meteorology
it has been found to be of great value, and in commerce
and industry too it is gaining very great importance. Whenever a
shop-keeper wants to stock the goods he sells, he always looks
to the modal size of those goods. The modal size of a collar, or
the modal size of shoes, is of great importance to the businessmen
dealing in ready-made garments or shoes. Many problems of
production are related to the mode. Many business establishments
these days keep statistics
of their sales to ascertain the particulars of the modal articles
sold.

Geometric Mean. Geometric mean can advantageously be 
used in the construction of index numbers. It makes the index 
numbers reversible and gives equal weight to equal ratio of 
changes. This average is also useful in measuring the growth 
of population, because population increases in geometric 
progression. When there is wide dispersion in a series geometric 
mean is a useful average. 


Harmonic mean. This average is useful in the cases where 
time, rate and prices are involved. When it is desired to give 
the largest weight to the smallest item, this average is used. 


Illustration—50 


How will you find— 
(a) the average marks of a class of students to show the 
level of intelligence. 
(b) the average cost of goods purchased in different
lots to determine the selling price.
(c) the average size of groups of items for the purpose
of classification, and
(d) the average rate of increase in prices when the prices 
increase at different rates during successive periods ? 
Explain why you should adopt a particular method in
each case.
(B. Com. Agra)

(a) Median (b) Weighted arithmetic average (c) Mode
(d) Geometric Mean.


(For reasons please see their uses) 


Theoretical Questions


1. Explain the uses of the different types of averages, with
illustrations.
(B. Com. Luck.) 


2. What is meant by ‘Central Tendency’? Describe the 
measures of measuring central tendency. Point out the usefulness 
and limitations of each method. 

(B. Com. Bombay) 

3. What is the purpose served by an average? Discuss
the special advantages attached to the different averages and 
illustrate their uses. 

(B. Com. Agra) 

4. Write a note on the relative merits and uses of the
following averages :— 

(a) Arithmetic Average 
(b) Median 
(с) Mode 
(d) Geometric Mean 
(e) Harmonic Mean 
(B. Com. Agra) 

5. What is an average ? Under what circumstances would 

you use the following :— 


(a) The mode instead of arithmetic average. 
(b) The geometric average instead of arithmetic average. 
(c) The arithmetic average instead of median. 

(B. Com. B. H. U.) 


6. Discuss, giving examples, the merits and defects of the 
averages generally employed in business Statistics. 


(B. Com. Luck.) 


7. Define arithmetic average, geometric mean, median, and 
mode. Which of these is most representative and why ?


(M. A. Agra) 


8. Which of the averages will be most useful in the 
following problems ? Give reasons— 


(a) Per Capita consumption of food in а family 
consisting of children, women and men, 
(b) average earnings of a pleader 
(c) normal size of a hat for hat manufacturers 
(d) average size of oranges on a tree.
(B. Com. B. H. U.)

9. Define an average and give the properties that a good
average should possess. In the light of these properties compare
the advantages and disadvantages of the different averages and
indicate in which kinds of problems each one of them can be used
with the greatest advantage.

10. (a) Define what is meant by an average of first order.


(b) What are the conditions which an average should 
satisfy, illustrating with reference to arithmetic 
mean and examining its uses and limitations ? 


11. What is a statistical average ? What are the desirable 
properties for an average to possess ? Which of the averages, 
you know possess most of these properties ? 

(M. A. Delhi) 


12. Point out the advantages and disadvantages of the chief 
kinds of averages used in Statistics. 
(M. A. Calcutta)


13. What are the functions of a statistical average ?
Explain by taking suitable examples the use of any three of the 
averages generally used in statistical work. 

(B. Com. Luck.) 


14. 'An average is a substitute for a complex group of
variables, but it is not always safe to depend on the substitute
alone, to the exclusion of individual measurements of the people.'


Discuss. 
(B. Com. Alld.) 


15. Define a weighted average. How does it differ from 
an unweighted mean ? Discuss critically the use of weighted 
mean in statistics describing the cases in which the weighted mean 
is better than unweighted mean. 
(B. Com. Calcutta)


16. Comment: “The arithmetic mean has the disadvantages 
of being affected by extreme values and may therefore give a 


biased figure." 
Practical Questions 


1. Obtain the Mean, Median and the Mode of the following
distribution :—


Marks      Frequency
10—25          6
25—40         20
40—55         44
55—70         26
70—85          3
85—100         1


(M. A. Agra)

2—Calculate the arithmetic average and the Median from the
following data :—


Age      No. of People    Age      No. of People
55—60          7          35—40         30
50—55         13          30—35         33
45—50         15          25—30         28
40—45         20          20—25         14

                                  Total 160


(B. Com. Luck.)
(a = 37.06 years, M = 35.92 years after reversing the series)
3—Calculate the arithmetic mean of the following
distribution :—
Profit per shop No. of Shops 
0—10 12 
10—20 18 
20—30 27 
30—40 20 
40—50 17 
50—60 6 


Find also graphically the value of Median. 
(B. Com. Bombay) 
(a = Rs. 28.0, M = Rs. 27.6)


4—Find the average marks of a student from the following
table :—


Marks        Number of students
Below 80            240
  "   70            190
  "   60            125
  "   50             95
  "   40             75
  "   30             60
  "   20             40
  "   10             25


(B. Com. B.H.U.)
(The series will be rearranged. a = 49.58 marks)


5—(a) In chemistry a student was graded 85 in class work,
80 in laboratory and 65 in final examination. If these were
weighted 1, 2 and 3 respectively, what was the student's average
grade ?

(b) The mean grade of one class of 20 students is 66%, and
that of another class of 15 students is 70%. Find the mean grade
of the two classes taken together.

(Weighted a = 73.3% and 67.7% respectively)

6—Explain what is meant by weighted average.
Calculate (i) the unweighted mean of the prices in column
III and (ii) the mean obtained by weighting each price by the
quantity consumed.


I                   II                III
Articles of Food    Qty. consumed     Price in rupees per md.

Flour               11.5  Mds.              5.8
Ghee                 5.6   "               58.4
Sugar                0.28  "                8.2
Potato               0.16  "                2.5
Oil                  0.35  "               20.0


(M.A. Calcutta)
(a = Rs. 18.98, and weighted average Rs. 22.55)


7— From the results of the two colleges A and В given below 
state which of them is better and why ? 


College A. College B 
Name of us 
Exam. Appeared Passed Appeared Passed 
M.A. 30 25 100 80 
M.Com. 50 45 120 95 
B.A. 200 150 100 70 
B.Com. 120 75 80 50 
Total 400 i 295 400 295 


(B. Com. Luck.) 
(M. Com. Vikram) 


(First find out the percentage of passes in each examination
and then calculate the weighted average. W.A. of A = 73.75% and
W.A. of B = 73.7%. College A is better.)

8— The table below gives the marks obtained in Advanced 
Accounting by the students with Roll Nos. 1 to 10 at the Final 
Chartered Accountants’ Examination :— 


Roll No.    Marks       Roll No.    Marks
            obtained                obtained

[The marks of the ten candidates are not legible in this copy.]

Calculate the (a) Mode, (b) Median, (c) Arithmetic Average.
[Chartered Accountants' Examination.]

Ans. Mode .. 52.3

Median .. 52.5

Arith. Average .. 52.6


9—The table given below has been constructed from data
obtained in a factory showing the distribution of the number of
processed articles per day per person and the rate of payment.


Daily no. of articles     No. of persons    Rate of payment per
processed per person      processing        article processed (pies)
     80— 99                    12                  3.1
    100—119                    63                  3.2
    120—139                    87                  3.3
    140—159                    56                  3.4
    160—169                     8                  3.5


This means that there are 12 persons each of whom can
process between 80 and 99 articles daily at the rate of 3.1 pies
per article processed, and similarly for the other figures.

Calculate the rate of payment per person per article
processed.


[Cal. Univ. Dip. S.W.]


(Ans. 3.35 p.)

10—Ten coins are tossed 1024 times. The theoretical
frequencies of 10 heads, 9 heads, ........, 1 head, 0 head are
given below :—

No. of heads    Frequency    No. of heads    Frequency
     0              1             6             210
     1             10             7             120
     2             45             8              45
     3            120             9              10
     4            210            10               1
     5            252


Calculate the mean number of heads per tossing.

[Cal. Univ. Business Management.]
(Ans. 5)
11—A customer parking garage of a godown store finds that
the number of cars that seek parking facilities varies greatly from
day to day over the week. Last year the store had 20 attendants
in the garage on Mondays, 10 on Tuesdays, 15 on Wednesdays,
10 on Thursdays, 5 on Fridays and 30 on Saturdays.


a—Find the mean number of attendants for the week.
b—Find the median number of attendants for the week.
c—Find the modal number of attendants for the week.
(Ans. a = 15, M = 12.5, Z = 10)
(a = 47.95, M = 48.35, and Z = 48.57)

12—Given the following frequency distribution, calculate the
Arithmetic Average :—


Monthly wages     Workers
12.5—17.5            2
17.5—22.5           22
22.5—27.5           19
27.5—32.5           14
32.5—37.5            3
37.5—42.5            4
42.5—47.5            6
47.5—52.5            1
52.5—57.5            1


(M. Sc. Ag. Punjab)
(Ans. a = Rs. 27.85)

13—You take a trip which entails travelling 900 miles by
train at an average speed of 60 miles per hour, 3000 miles by
boat at an average of 25 m.p.h., 400 miles by plane at 350 m.p.h.
and finally 15 miles by taxi at 25 m.p.h. What is your average
speed for the entire distance ?


(Ans. H.M.—40.1 approx) 


14— Calculate the modal and median incomes from the 
following distribution :— 


Income per month Number of families 
in Rupees 

15—20 80 
20—30 120 
30—40 201 
40—50 150 
50—60 75 
60—70 25
over 70 19

Total 670


(Ans. M = 36.7, Z = 36.1)


15—The following table gives the number of persons with
different incomes in the U. S. A. during the year 1929 :—


Income in thousands of     No. of persons
dollars                    in lakhs
Under 1                         13
   1—   2                       90
   2—   3                       81
   3—   5                      117
   5—  10                       66
  10—  25                       27
  25—  50                        6
  50— 100                        2
 100—1000                        2

Calculate the average income per head. (B. Com., Luck.)


(Ans. $8.06 thousands)


16— The following table gives the male population of Kanpur 
and Jaipur in 1931 :— 


Age group (Years) Population of males in thousands 
Kanpur Jaipur 

Ü—. 5 .. va 14 9 

5—10 os т" 18 8 
10—15 . 441+: эў 18 8 
15—20 ss i: 18 7 
20—30 sy on 88 15 
80—40 Де) ПЕ 29 12 
40—50 Gx dU 17 9 
50—60 P s т 6 
60—80 Ee 4 4 


Calculate the average age of males at Kanpur and Jaipur
separately and account for the difference, if any.

(B. Com., Allahabad)

(Ans. Kanpur 26.5 years, Jaipur 27.1 years)

17—The frequency distribution below gives the cost of
production of sugarcane in different holdings. Obtain the
Arithmetic mean.


Cost      Frequency    Cost      Frequency
 2— 6          1       18—22         52
 6—10          9       22—26         36
10—14         21       26—30         19
14—18         47       30—34          3
(I. A. & A. S.)


(Ans. a = 19.2128)

18— Calculate the arithmetic, the geometric and the harmonic 
means and the median from the following figures :— 

375.5, 158.4, 28.5, 12.01, 4.5, 3.74, 12.79, 35, 41.9 and 58. 

(B. Com. Alld.) 

(Ans. a = 87.06, G.M. = 28.56, H.M. = 13.08, M = 31.75)

19—Make a frequency table having grades of wages with
class intervals of two annas each from the following data of daily
wages received by 30 labourers in a certain factory and then
compute the average daily wages paid to labourers.

Daily wages in annas : 14, 16, 16, 14, 22, 13, 15, 24, 12, 28,
14, 20, 17, 21, 18, 18, 19, 20, 17, 16,
15, 11, 12, 21, 20, 17, 18, 19, 22, 28.

(B.A. Hons., Punjab)

(Ans. a = 18 As.)

20—Below are given the marks obtained by а batch of 

students appearing in Statistics in the Certificate Course 
Examination, maximum marks in the paper being 50 :— 

14, 22, 25, 15, 11, 88, 28, 26, 22, 80, 18, 16, 27, 82, 

19, 12, 21, 18, 16, 10, 31, 29, 28, 24, 17, 28, 20. 

Find out (a) the median marks directly and (b) the median 
marks after classifying the given marks into class-intervals of 
10-15, 15-20, etc. Account clearly for the difference, if any,
between the two values of median so computed. 

(B. Com., Allahabad) 
(Ans. (a) 22 marks, (b) 22.14 marks) 


21—The following table gives the age distribution of married 
females according to sample census of 1941 in the Baroda State :— 


Age No. Age No. Age No. 
0—5 Б] 25—80 2,223 50—55 581 
5—10 81 80—35 1,728 55—60 817 

10—15 410 85—40 1,292 60—65 156 
15—20 1,809 40—45 963 65—70 59 
20—25 2,446 45—50 762 70—75 87 


Calculate the median age of married females and also the two 


quartiles. 
(Ans. M = 28.78, Q1 = 21.09, Q3 = 38.6)
22—Calculate the values of the median and the two quartiles
for the following :—


Limits of percentage         Factories in India
recovery of sugar cane           (1935-36)
  8.0— 8.2                           2
  8.2—                               5
  8.4—                               4
  8.6—                              11
  8.8—                              11
  9.0—                              11
  9.2—                              13
  9.4—                              10
  9.6—                               7
  9.8—                               6
 10.0—                               3
 10.2—                               1
 10.4—10.6                           1


(M.A., Punjab)
(Ans. M = 9.182, Q1 = 8.79, Q3 = 9.55)

23—Determine the quartiles and the median from the
following table :—


Income                         No. of persons

Below Rs. 30                        69
Rs. 30 and below Rs. 40            167
Rs. 40 and below Rs. 50            207
Rs. 50 and below Rs. 60             65
Rs. 60 and below Rs. 70             58
Rs. 70 and below Rs. 80             27
Rs. 80 and over                     10

Total                              603


(B. Com. Bombay)
(Ans. M = 43.18, Q1 = 34.91, Q3 = 51.54)


24—Find out the average of (A) motion in case of a person
who rides the first mile @ 10 miles an hour, the next mile @ 8
miles an hour, and the third mile @ 6 miles an hour; (B) increase
in population which in the first decade has increased 20%, in the
next 25% and in the third 44%.


(Ans. (A) H.M. = 7 31/47 m.p.h., (B) G.M. = 29.3%)


25—The number of bacteria in a certain culture was found
to be 4 × 10^8 at noon on one day. At noon the next day the
number was found to be 9 × 10^8. If the number increased at a
constant rate per hour, how many bacteria were there at midnight ?


(Ans. G.M. = √((4 × 10^8) × (9 × 10^8)) = 6 × 10^8)


26—If x₁ and x₂ are two positive values of a variate, prove
that their geometric mean is equal to the geometric mean of their
arithmetic and harmonic means.


27—Fifty items sold in Department A of the Corner Store 
had a mean price of 30 cents. Seventy five items sold in 
Department H had a mean price of 20 cents. Find out the mean 
price of commodities sold in Departments А and Н. 


(Ans. Combined Mean=24) 


28—The following table gives the distribution of the male
and female population of a certain area in India. By finding the
median age and the upper and lower quartile ages, comment on
the age distribution of the two sexes in the area :—

Age group      Male      Female
 0— 9          2,756     2,787
10—19          2,124     2,082
20—29          1,677     1,724
30—39          1,481     1,485
40—49          1,021     1,022
50—59            610       579
60—69            245       269
70—79             67        78
80—89             16        20
90—99              3         4
Total         10,000    10,000
(I. C. S.)


(Ans. Males : M=20.22 years, Q1=8.62 years, Q3=35.87 years.
Females : M=20.55 years, Q1=8.52 years, Q3=35.95 years)


29—Define the 'Ogive' of a frequency distribution. Draw
the Ogive of the following data giving the percentage of persons
of different ages employed in a factory :—


Age       Percentage     Age       Percentage
16—20        3.6         41—45       10.7
21—25        9.8         46—50        9.1
26—30       27.4         51—55        5.1
31—35       20.4         56—60        0.6
36—40       13.3


Read from the diagram the median age and the two quartile
ages. Verify the calculations.


(Ans. M=32.88, Q1=27.66, Q3=41.1 years)


30—According to the census of 1941 the following are the
population figures in thousands of the first 36 cities of India :—


2488   391   208   178   360   176
1490   181   777   258   213   147
 733   437   176   143   522   284
 193   181   672   302   160   153
 591   268   213   142   407   260
 169    92   387   239   204   151


Find the median and the quartiles.
(M. Com., Agra)
(Ans. M=226, Q1=170.75, Q3=403, after arranging the population
figures in ascending order)
31—Find the Mode, the Median and the Quartiles of the
following series :—
Size   Frequency   Size   Frequency
  4       40        12       50
  5       48        13       52
  6       52        14       41
  7       56        15       57
  8       60        16       63
  9       63        17       52
 10       57        18       48
 11       55        19       40


(B. Com., B. H. U.)
(Ans. Z=9, M=11, Q1=8, Q3=15)


32—The numbers of fully formed tomatoes on 100 plants
were counted with the following results :—


 2 plants had  0 tomatoes
 5    "    "   1    "
 7    "    "   2    "
11    "    "   3    "
18    "    "   4    "
24    "    "   5    "
12    "    "   6    "
 8    "    "   7    "
 6    "    "   8    "
 4    "    "   9    "
 3    "    "  10    "


(i) How many tomatoes were there in all ?

(ii) What was the average number of tomatoes per plant ?

(iii) What was the mode or modal number of tomatoes ?
(Ans. (i) 486, (ii) a=4.86, (iii) Z=5)


33—The marks (out of maximum of 100) obtained by
candidates in an examination are shown in the following frequency
table. Calculate the Arithmetic Average and the Mode.


Marks          Number of Candidates
17.5—22.5              2
22.5—27.5              8
27.5—32.5             33
32.5—37.5             80
37.5—42.5            170
42.5—47.5            243
47.5—52.5            213
52.5—57.5            145
57.5—62.5             67
62.5—67.5             35
67.5—72.5              4


(B. Com., Agra)
(Ans. a=46.965, Z=46.04)
34—From the figures given below find the mode, median and
quartiles. What information could you deduce from them ?


Age       No. of Persons     Age       No. of Persons

20—25          50            40—45         150
25—30          70            45—50         120
30—35         100            50—55          70
35—40         180            55—60          59
(B. Com., Agra)


(Ans. Z=38.6, M=40.0, Q1=34, Q3=47.1)


35—Under what assumptions is mode located in a frequency
distribution ? Compute the mode of the following distribution :—


Size of item     Frequency
  4— 8              10
  8—12              12
 12—16              16
 16—20              14
 20—24              10
 24—28               8
 28—32              17
 32—36               5
 36—40               4
(B. Com., Alld.)
(Ans. Z=14.67)


36—Find the Median, Lower Quartile, 7th Decile and 85th
Percentile of the frequency distribution given below :—


Marks in Statistics


Marks group       No. of Students
Under 10                 8
10—20                   12
20—30                   20
30—40                   32
40—50                   30
50—60                   28
60—70                   12
70 and above             4
(B. Com. Alld.)
(Ans. M=40.5, Q1=28.375, D7=50.32, P85=58.2)
37—Draw a cumulative frequency graph of the following
distribution showing the monthly wages of a group of workmen
and hence or otherwise calculate the value of (a) the Mode, (b) the
median and (c) the two quartiles :—
Wages in Rs. | 20— | 21— | 22— | 23— | 24— | 25— | 26— | 27— | 28—29


(M.A. Rajputana)
(Ans. Z=25.3, M=24.775, Q1=23.08, Q3=26.05)
38—Attempt any three of the following :—

(a) If the Mode and the Mean of a moderately
asymmetrical series are respectively 16 inches and
15.6 inches, what would be its most probable Median ?

(b) If, in a series which is not highly skewed, the Mean
Deviation is 7.8 feet, what would be the approximate
value of its Standard Deviation ?

(c) A man travels 50 miles at a speed of 20 miles per
hour and then returns at a speed of 30 miles per hour.
What is his average speed for the whole journey ?

(d) Find the average rate of increase in population which
in the first decade has increased 20%, in the next
30% and in the third 45%.


(Ans. (a) M=15.7, (b) σ=9.75, (c) H.M.=24, (d) G.M.=30)


39—Calculate the Mode and the Arithmetic average from
the following series and account for the difference, if any :—


Size of the item     Frequency
  6—10                  20
 11—15                  30
 16—20                  50
 21—25                  40
 26—30                  10


(B. Com. B. H. U.)
(Ans. Z=18.8, a=17.67, after amending class intervals as
(5.5—10.5) and so on.)


40—Find the arithmetic average, median and the Quartiles
from the following distribution of 100 persons by age :—


Age last birth day     Number
15—19                     4
20—24                    20
25—29                    38
30—34                    24
35—39                    10
40—44                     4
(M.A. Alld.)


(Ans. a=28.4, M=27.9, Q1=24.66, Q3=32.36)
41—Define the Mean, the Median and the Mode. Find their
values in the case of the heights of trees in a garden whose
frequency distribution is given in the following table :—


Heights
Under  7 feet
  "   14  "
  "   21  "
  "   28  "
  "   35  "
  "   42  "
  "   49  "
  "   56  "


Frequencies


9360
(M.A. Agra)


(Ans. a=30 feet 1 inch, Z=33 feet 6 inches, M=31 feet
11 inches. It is a 'Below Table'.)


42—Find the average marks of a student from the following
table :—


Marks Below     Number of Students
    80                 240
    70                 190
    60                 125
    50                  95
    40                  75
    30                  60
    20                  40
    10                  25


(B. Com., B. H. U.)


(Ans. The series will be reversed and rearranged ; a=49.58)
43—Find out the Median and the Mode from the following
table :—

No. of days absent     Numbers
Less than  5              29
   "      10             224
   "      15             465
   "      20             582
   "      25             634
   "      30             644
   "      35             650
   "      40             653
   "      45             655


(B. Com. Luck.)
(Ans. M=12.2, Z=11.35 days)


44—Recast the following cumulative table into the form of
an ordinary frequency distribution and determine the value of
mode by using the formula :—

Mean - Mode = 3 (Mean - Median)


No. of days absent     No. of Students
Less than  5              29
   "      10             224
   "      15             465
   "      20             582
   "      25             634
   "      30             644
   "      35             650
   "      40             653
   "      45             655


(B. Com. Bombay)


(Ans. a=12.9, M=12.2, Z=10.8. By usual formula Z=11.35)


45—From the following table calculate mean and median.
By graph verify the median. Crop-Cutting Experiment Data on
Plot Yields of Wheat.


Yield in lbs.     No. of Plots
Over   0              216
  "   60              210
  "  120              156
  "  180               98
  "  240               57
  "  300               31
  "  360               18
  "  420                7


(B. Com. Saugar)
(Ans. a=188.9 lb., M=170.2 lb.)


46—Frequency distribution of marks obtained by a class of
students shows the following :—


Marks        Number of Students
 0— 30              10
30— 40              15
40— 50              30
50— 60              32
60— 70               8
70—100               5


(a) Find the median by drawing the Ogive Curve.


(b) Check up the value of the median so found by using
the standard formula for finding the median.


(B. Com. B. H. U.) (Ans. M=48.5)
47—Find the median and the modal size from the following
data :—

Size of the item in ft.     Frequency
 1—10                           9
10—19                          13
19—28                          86
28—37                         239
37—46                         120
46—55                          46
55—64                          12

(M.A. Alld.)


(Ans. M=33.837, Z=33.06)


48—A car travels at a speed of 30 miles per hour for the first 
40 miles, then at a speed of 35 miles for the next 40 miles, then 
at a speed of 45 miles for the next 40 miles, again at a speed 
of 38 miles for the next 40 miles, and at a speed of 35 miles for 
the next 40 miles. What is the average speed of the car on its 
journey ? 


(Ans. 35.97 m. p. h.) 


49—A sum of money was invested for five years. The
average rates of return for the investment for the five successive
years were as follows : 5.50 per cent., 4.73 per cent., 4.20 per
cent., 3.91 per cent., and 4.64 per cent. What was the average
rate of interest for the five years ?

(Ans. 4.564%)


50— Calculate the geometric mean and the harmonic mean of 
the following figures :— 


1238 ; 178.7 ; 89.9 ; 78.43 9.7; .874; .989 ; .012; .008; 


.0009. 
(B. Com., Allahabad) 


(Ans. G.M.=2.019, H.M.=.007587)
51—Monthly incomes of twenty families are given below in
rupees :—
2,000 ; 35 ; 400 ; 15 ; 40 ; 1,500 ; 300 ; 6 ; 90 ; 250 ;
20 ; 19 ; 450 ; 10 ; 150 ; 8 ; 95 ; 30 ; 1,200 ; 60.


Calculate the Geometric Mean and the Harmonic Mean of
the above incomes.
(B. Com., Alld.)


(Ans. G.M.=78.11, H.M.=26.07)
52—Calculate the geometric mean of the following two
series :—

(a)        (b)
9574       .8974
475        .0570
75         .0081
5          .5677
8          0002
.08        .0984
.005       0854
.0009      .5672


(Ans. G.M. (a)=1.841, (b)=.06223)


53—Following is the frequency distribution of the yield of
cane in tons per acre :—


Class-intervals     Frequency
35—                     7
40—                     8
45—                    12
50—                    26
55—                    32
60—                    42
65—                    42
70—                    15
75—                    17
80—85                   9


Calculate the mean.

(M.A. Raj.)

(Ans. a=61.83)

54—The following data refer to the number of employees
and their monthly earnings in two establishments. Calculate and
compare their respective weighted averages :—


                                 A                        B
Category of Employee     No. of      Monthly      No. of      Monthly
                         Employees   Earning      Employees   Earning

Managers                     8         800            2         750
Supervisors                 20         145           10         150
Administrators              15          50           15          60
Technicians                 25          80           25          50
Skilled workers             80          85           40          40
Workers                    250          20          120          20


(Ans. Wa (A)=Rs. 37.74, Wa (B)=Rs. 49.2)


55—Following table gives frequency distribution of 5% Bonds.
Find out the modal value of the distribution :—

Quoted Price       Frequency
Less than 80           11
 80— 89.9               7
 90— 99.9              14
100—109.9              29
110—119.9               7
120—129.9               8
130 and more            2
                       78


(Ans. Z=104.01)


56—Find the modal wage from the following table :—


Wages in rupees     No. of Labourers
above 30                 520
  "   40                 470
  "   50                 399
  "   60                 210
  "   70                 105
  "   80                  45
  "   90                   7


(Ans. Z=Rs. 55.78)


57—One hundred and twenty individuals firing at a moving
target miss by the following distances, the positive and the negative
signs corresponding to the shot being in advance of or behind the
target :—


 1 shot  is  between +10 and +15 inch wide
 8 shots are    "    + 5  "  +10  "    "
20   "    "     "      0  "  + 5  "    "
25   "    "     "    - 5  "    0  "    "
22   "    "     "    -10  "  - 5  "    "
17   "    "     "    -15  "  -10  "    "
18   "    "     "    -20  "  -15  "    "
10   "    "     "    -25  "  -20  "    "


Find the average distance behind the target by which the
shots tend to miss.
(Ans. a=8.7875")


58—In a group of 500 wage earners the monthly wages of 4%
were under Rs. 60 and those of 15% were under Rs. 62.50. 15%
of the workers earned Rs. 95 and over and 5% of them got Rs. 100
and over. The median and quartile wages were Rs. 82.25, 72.75
and 90.50 ; the fourth and 6th Decile wages were Rs. 78.75 and
85.34 respectively. Put this information in the form of a frequency
distribution and estimate the mean wage.
Hint—Assume a range of 55—105 ; the frequency distribution can
be put in this manner.


                  %                       %
55   —60          4      82.25—85.3      10
60   —62.5       11      85.3 —90.5      15
62.5 —72.75      10      90.5 —95        10
72.75—78.75      15      95   —100       10
78.75—82.25      10      100  —105        5
                                        100

(Ans. a=80.9)


59. The mean daily sunshine for Gt. Britain and Ireland
for the years 1945—55 is given below.


Month | Jan. | Feb. | Mar. | Apl. | May  | June | July | Aug. | Sept. | Oct. | Nov. | Dec.


Hrs.  | 1.49 | 2.40 | 3.62 | 5.21 | 5.81 | 6.25 | 5.46 | 5.82 | 4.41  | 2.99 | 1.85 | 8.40


Find the median number of hours of sunshine per day.


(B. Com., B. H. U.)
(Ans. M=4.015)


60—The following figures show the monthly incomes of 700
families in a certain locality :—


Monthly Income     Number of families
  0—                     93
 50—                    205
100—                    157
150—                    109
200—                     64
250—                     41
300—                     22
350—400                   9


(a) Draw 'Less than' and 'More than' Ogive curves and
determine the value of median.


(b) Check your result by using the standard formula for
locating the median.


(Cert. St. B. H. U.)


61—The following data have been taken from a Government
Publication :—
Percentage Unemployed Among Insured Persons
(Average for the year)
Year   Males   Females     Year   Males   Females


1941   16.1     8.7        1949   16.4    14.4
1942   12.4     9.0        1950   22.4    17.7
1943   10.8     8.5        1951   25.1    18.5
1944   12.0     8.1        1952   23.1    11.2
1945   13.2     9.5        1953   19.1     9.8
1946   10.9     6.2        1954   17.8     9.4
1947   12.2     6.7        1955   14.6     8.8
1948   11.5     7.2        1956   11.8     7.4


Find out the average percentage of unemployed males and
females among insured persons for the years 1941—1956 using
(a) Direct Method and (b) Short-cut Method.


(M.A., B. H. U.)
(Ans. a for males=15.59%, for females=9.74%)


62—The following table gives the number of families
residing in different rent-class houses. Find the average rent
paid by a family by computing the median.


Monthly Rents          No. of families paying rent
Less than Rs. 5                 192
  5— 10                         147
 10— 15                          70
 15— 20                          27
 20— 30                          29
 30— 40                           9
 40— 50                           3
 50— 70                           4
 70—100                           3
100 and above                     1

                                485


(B. Com., Aligarh)
(Ans. M=6.7)


63—Name the different averages used in Statistics and
explain how they conform to the requisites of a good average.
Also mention the situation in which each of them would be
appropriate.

Obtain the mean, median and the mode of the following
distribution.
Marks        Frequency
10— 25           6
25— 40          22
40— 55          44
55— 70          26
70— 85           3
85—100           1
(M.A., Agra)


(Ans. a=47.6 ; M=48 ; Z=48.25)


64—The monthly incomes of ten families in rupees in a
certain locality are given below. Calculate the arithmetic mean, the
geometric mean, and the harmonic mean. Which one of the above
three averages represents the figures best ?


Family    A    B    C    D    E     F    G    H     I    J
Income    85   70   10   75   500   8    42   250   40   86


(B. Com. Agra)

(Ans. a=111.6 ; G.M.=55.34 ; H.M.=28.8)

65—Draw a cumulative frequency graph of the following
distribution, showing the monthly wages of a group of workmen,
and hence or otherwise calculate the values of (a) the mode,
(b) the median, and (c) the two quartiles :—


Wages in rupees     No. of workmen
      20                  8
      21                 10
      22                 11
      23                 16
      24                 20
      25                 25
      26                 15
      27                  9
      28
(I. A. S.)


66—Draw a cumulative frequency graph showing the distribution
of marks in the table below and locate and measure the
median and quartiles.


Marks       Frequencies
 1— 5            7
 6—10           10
11—15           16
16—20           32
21—25           24
26—30           18
31—35           10
36—40            5
41—45


(B. Com. Agra)
67—Construct an Ogive from the following figures and read
the following : (a) Median (b) Quartiles (c) One percentile.


Cotton consumed in
thousand candies       No. of Mills
  0— 2                      5
  2— 4                     18
  4— 6                     12
  6— 8                     11
  8—10                      8
 10—12                      4
 12—14                      1
 14—16                      8
 16—18                      1
 18—20                      1
 over 20                    2

(B. Com. Aligarh)


68—Find out Mean :—

F : 5, 8, 11, 6, 8, 4, 4, 8, 1

(Ans. a=12.87)


69—Calculate the Mean and mode in the following series—

Class
F : 2, 4, 9, 3, 28, 20, 4

(Ans. a=8.8, Mode=7.8)


70—A limited company wants to pay bonus to the members
of its staff. The bonus is to be paid as under :


Monthly salary                     Bonus
     Rs.                            Rs.
100 and not exceeding 120            50
120  "   "      "     140            60
140  "   "      "     160            70
160  "   "      "     180            80
180  "   "      "     200            90
200  "   "      "     220           100
220 and above                       110
Actual salaries of the members of the staff are given as under.


Rupees—200, 180, 185, 195, 218, 187, 160, 250, 198,
190, 168, 170, 178, 175, 140, 120, 148, 165,
155, 145, 125, 110, 162, 130, and 150.


What is the total bonus paid ?
What is the average bonus paid per member of the staff ?
(B. Com. Gujrat)


(Ans. Total bonus paid is Rs. 1990, and average bonus
paid per member is Rs. 79.6)


71—The following are the marks obtained by 25 students in
Statistics :—
55, 5, 10, 22, 80,
40, 48, 50, 28, 20,
82, 7, 18, 15, 85,
15, 25, 87, 49, 29,
88, 45, 26, 99, 87.


Tabulate the above data in the form of a continuous series
with a class interval of 10, one of the groups being 20—29. From
the tabulated data, calculate the mean and the median. Also calculate
the median of the original untabulated data and explain why there
is a difference between the averages of the tabulated data and
of the original data.

(B. Com. Gujrat)


(Ans. Mean=29.8, Median of tabulated data=30 and of
untabulated data=29)


72—Determine the median, upper quartile and third decile
of the incomes of the employees of a firm :—


Income              No. of persons
Below Rs. 50              25
Rs. 50— 60                69
Rs. 60— 70               107
Rs. 70— 80               170
Rs. 80— 90               201
Rs. 90—100               112
above Rs. 100             65


(B. Com. Bombay)
(Ans. Median=80.19, Q3=89.53, D3=71.41)
73—A sample of 25 values has the mean 80 and standard
deviation 5. A second sample of 75 values has mean 85 and
standard deviation 3. Find the mean and the standard deviation
of the combined sample of 100 values.


(Ans. Combined Mean=83.75)


74—One of the State Governments, as a measure of the
economy drive, proposes to introduce a voluntary salary cut
according to the following slab system based on average monthly
salaries.


Salary slab per month                         Proposed salary cut
(in Rupees)                                       (in Rupees)

Not more than 50                                      Nil
Exceeding   50 but not more than  100                   5
    "      100  "   "   "    "    150                  10
    "      150  "   "   "    "    200                  15
    "      200  "   "   "    "    300                  25
    "      300  "   "   "    "    500                  30
    "      500  "   "   "    "    750                  75
    "      750  "   "   "    "   1000                 100
    "     1000  "   "   "    "   1500                 250
    "     1500  "   "   "    "   2000                 500
Above 2000                                            750


The average monthly salaries of the 40 members of a
department during the year 1953-54 are—

68, 48, 64, 72, 125, 875, 420, 560, 790, 850,
1020, 2800, 1750, 95, 180, 620, 865, 950, 255, 155,
590, 1200, 1550, 145, 475, 90, 115, 165, 210, 820,
470, 700, 160, 220, 140, 155, 210, 225, 810 and
410.

If all the members of the department volunteer to accept the
proposed cut, calculate the total saving to the Government on
account of this measure.

(Bombay B. Com.)


(Ans. Rs. 8425 per month is the total saving to the Govt.)


75—From the table given below find the mode :—


Marks             | 1-5 | 6-10 | 11-15 | 16-20 | 21-25 | 26-30 | 31-35 | 36-40 | 41-45


No. of
Candidates        | 1 10 | 16 | 2


(B. Com. Delhi and Agra)
(Ans. Z=18.66 marks)


CHAPTER 9


DISPERSION AND SKEWNESS


“There is another kind of little figure. It is the one that
tells the range of things or their deviation from the average that
is given. Place little faith in an average .... when those important
figures are missing. Otherwise you are as blind as a man
choosing a camp site from a report of mean temperature alone.”


Darrell Huff


The central value of a frequency distribution tells us
something about the general level of magnitude of the distribution,
but it fails to give a complete description of it. There are other
aspects of a frequency distribution which are also important.
One of them is the ‘dispersion’, ‘scatter’, ‘spread’ or ‘variation’.
This is a measure of the degree to which the items included in the
original distribution depart or vary from the central value.
“Dispersion or scatter, or variation or variability is relative to
any typical value and is a measure of the extent to which the
individual items vary. If the scatter about the measure of central
tendency is very large, it is of little use as a typical value.”
Measures of dispersion are also called averages of the second
order. The importance of such a measure is clear from the
following example. A village accountant asked the depth of a
river which he was to cross. He was told that the average depth
of the river was 3 feet. Relying on this average, he decided to
cross the river. At one place the depth was 7 feet, and he was
drowned. Thus variations from the average are also important.


If all the items in a distribution are widely dispersed and
there is no tendency to concentrate around any one value, then
clearly no average can adequately summarise the distribution.
Even where there is such a tendency, averages provide only
rather incomplete summaries of a frequency distribution ; it is
also essential to know what form the distribution has. According
to Prof. Neiswanger, "Two distributions of statistical data may be
symmetrical and have common means, medians, and modes, and
identical frequencies in the modal class. Yet with these points
in common they may differ widely in the scatter of their values
about the measures of central tendency."


1 M. Zia-ud-din—Practical Statistics.


Frequency distributions may differ from each other
mainly in two ways :—


(I) Their averages may be the same, but the formation of the
items may be very different.


(II) Their averages may be different, but the manner of
distribution of the items may be the same.


This is illustrated in the following example.


          Condition I                        Condition II
Subjects   Marks of   Marks of   |   Subjects   Marks of   Marks of
              A          B       |                 A          B
   A         12         20       |      A         15          9
   B         13         20       |      B         16         10
   C         14         10       |      C         17         11
   D         15         10       |      D         18         12
   E         16         10       |      E         19         13
Total        70         70       |                85         55

Averages
          Condition I                        Condition II
   A    70/5 = 14                       A    85/5 = 17
   B    70/5 = 14                       B    55/5 = 11


Under the first condition, though the averages of both the series
are the same, there is a wide difference in the constitution
of the series. Under the second condition, though the averages
differ, there is uniformity in the constitution of the two series.
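
The contrast between the two conditions can be checked numerically. The following is a minimal Python sketch, assuming the marks are as tabulated above; the helper name `summary` and the use of the range as a rough indicator of spread are illustrative choices, not part of the text.

```python
from statistics import mean

# Marks of A and B under the two conditions of the example above.
condition_1 = {"A": [12, 13, 14, 15, 16], "B": [20, 20, 10, 10, 10]}
condition_2 = {"A": [15, 16, 17, 18, 19], "B": [9, 10, 11, 12, 13]}


def summary(marks):
    """Return (average, range) of a list of marks (hypothetical helper)."""
    return mean(marks), max(marks) - min(marks)


for label, data in (("I", condition_1), ("II", condition_2)):
    for student, marks in data.items():
        average, spread = summary(marks)
        print(f"Condition {label}, {student}: average={average}, range={spread}")
```

Under Condition I the two averages agree while the ranges differ; under Condition II the averages differ while the ranges agree, which is the point the example makes.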


Hence it becomes necessary to know how typical, i.e.
representative of the distribution, the average is ; whether most
of the values are concentrated around that average or widely
dispersed through the range. If the intermediate values
throughout the range and their distribution can be described
in some numerical form, a whole series can be summarised for
comparative purposes in a few simple figures. The methods used
to this end produce measures of dispersion.

In engineering problems measures of dispersion are often
specially important. The amount of actual variability in the
dimensions of supposedly identical parts is critical in determining
whether or not the components of a mass-produced item
really are interchangeable. The variability in length of life
of light-bulbs may be even more important than the average
if the bulbs are used in an inaccessible location and can be
replaced only at regular intervals.


A classical problem in the social sciences requiring the
measurement of variability is the measurement of ‘inequality’
in the distribution of income, wealth, etc.


Methods of Computing Dispersion


There are two classes of methods of computing dispersion.
These are :—


(A) Numerical Methods—
    (i) Methods of limits
        (a) The Range
        (b) The Inter-Quartile Range
        (c) The Percentile Range
    (ii) Methods of Moment
        (a) The first moment of dispersion (Mean Deviation)
        (b) The second moment of dispersion (from which the
            standard deviation is computed)
        (c) The third moment of dispersion
    (iii) Quartile Deviation
(B) Graphic Method—
    (i) Lorenz curve


Methods of Limits


The Range. This is an elementary measure of dispersion.
It is usually defined as the difference between the largest
and the smallest values of a distribution or series. Symbolically—


R = Highest value - Lowest value


Coefficient of R = (Highest value - Lowest value) / (Highest value + Lowest value)


Illustration—1


Find the Range and Range Coefficient of the following :—
60, 72, 36, 85, 35, 52, 76
The Highest value=85
The Lowest value=35
R = 85 - 35 = 50


Coefficient of R = (85 - 35) / (85 + 35) = 50/120 = .42 approx.
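
The two definitions above translate directly into code. This is a small Python sketch using the seven values of Illustration 1; the function names `value_range` and `coefficient_of_range` are illustrative and not taken from the text.

```python
def value_range(values):
    """Range: highest value minus lowest value."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Coefficient of range: (highest - lowest) / (highest + lowest)."""
    highest, lowest = max(values), min(values)
    return (highest - lowest) / (highest + lowest)


data = [60, 72, 36, 85, 35, 52, 76]  # values of Illustration 1
print("Range:", value_range(data))
print("Coefficient of range:", round(coefficient_of_range(data), 2))
```

Because the coefficient divides by the sum of the two extremes, it is a pure number and can be compared between series measured in different units.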
Illustration—2


Find the Range and Range Coefficient of the following :—


Value of variable     Frequency
       2                  8
       4                 10
       6                 25
       8                 47
      10                 17


Range = 10 - 2 = 8


Coefficient of Range = (10 - 2) / (10 + 2) = 8/12 = .67 approx.


Illustration—3


Find the Range and Range Coefficient of the following :—


Age in years     M. V.     Frequency
  5—10            7.5         10
 10—15           12.5         15
 15—20           17.5         20
 20—25           22.5          5

Range¹ = 22.5 - 7.5 = 15


Coefficient of Range = (22.5 - 7.5) / (22.5 + 7.5) = 15/30 = .5


Relative Measure of Range. A relative measure of range
is obtained by converting the absolute measure, that is, by
comparing it with some standard measure such as the mean, twice
the mean, the median or the mode.


1 Range should be computed by taking mid-points. See
A. R. Ilersic—Statistics, p. 121.
Illustration—4


Find out the relative measure of dispersion of range from the
following data :—


Marks      M. V.     F      mf      C. f.
10—20       15       5       75       5
20—30       25       8      200      13
30—40       35      15      525      28
40—50       45       7      315      35
50—60       55       3      165      38
60—70       65       1       65      39
70—80       75       1       75      40
Total               40     1420

a = Σmf / N = 1420 / 40 = 35.5

M = (N + 1) / 2 = (40 + 1) / 2 = 20.5th item
  = l1 + (l2 - l1) / f × (m - c) = 30 + (40 - 30) / 15 × (20.5 - 13)
  = 35

Z = l1 + (f1 - f0) / (2f1 - f0 - f2) × (l2 - l1)
  = 30 + (15 - 8) / (30 - 8 - 7) × (40 - 30)
  = 34.7

Range = 75 - 15 = 60

Relative measure of Range from a  = (75 - 15) / 35.5       = 1.7 approx.
    "       "     "    "    "  2a = (75 - 15) / (2 × 35.5) = .85   "
    "       "     "    "    "  M  = (75 - 15) / 35         = 1.7   "
    "       "     "    "    "  Z  = (75 - 15) / 34.7       = 1.7   "

Coefficient of Range = (75 - 15) / (75 + 15) = 60/90 = .67
Advantages of Range


1—It is a measure of dispersion which is simple to calculate
and easy to understand.


2—It gives a comprehensive value for the data in that, it 
includes the limits within which all of the items occur. 


3—It has got a special application in the quality control 
measures as well as in the measurement of fluctuation of the 
data. 


Disadvantages 


1—It is very much affected by the extreme items. Range 
is misleading if either of the extreme values is an unusual 
occurrence. 


2. It is an unsatisfactory measure because items in a 
frequency distribution have a tendency towards concentration 
in the middle of the series. 


3—It requires only one extreme item at either end of the 
series to render it virtually valueless as a reliable indication 
of the data. It is possible to have two distributions with the 
same range, but whereas in the one the frequencies are fairly 
evenly distributed throughout the range of the independent 
variable in the other the majority of values or observations are 
concentrated about a single value. 


In short, dependence on the two extreme items renders the
range most unreliable as a guide to the dispersion of the values
within a distribution. Its chief merit lies in its simplicity.
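
The range-based measures discussed above can be retraced for the grouped data of Illustration 4. The Python sketch below assumes the class limits and frequencies of that illustration, locates the median at the (N + 1)/2 th item as the text does, and takes the range between the extreme mid-values; the function names are illustrative only.

```python
# Class limits and frequencies as read from Illustration 4.
classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
freqs = [5, 8, 15, 7, 3, 1, 1]

mids = [(lo + hi) / 2 for lo, hi in classes]
n = sum(freqs)
mean = sum(m * f for m, f in zip(mids, freqs)) / n


def grouped_median(classes, freqs):
    """Median by interpolation within the class holding the (N + 1)/2 th item."""
    target = (sum(freqs) + 1) / 2
    cumulative = 0
    for (lo, hi), f in zip(classes, freqs):
        if cumulative + f >= target:
            return lo + (hi - lo) / f * (target - cumulative)
        cumulative += f
    raise ValueError("empty distribution")


def grouped_mode(classes, freqs):
    """Mode by l1 + (f1 - f0) / (2*f1 - f0 - f2) * (l2 - l1)."""
    i = max(range(len(freqs)), key=freqs.__getitem__)
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0
    lo, hi = classes[i]
    return lo + (f1 - f0) / (2 * f1 - f0 - f2) * (hi - lo)


median = grouped_median(classes, freqs)
mode = grouped_mode(classes, freqs)
rng = max(mids) - min(mids)  # range between extreme mid-values

for name, base in (("mean", mean), ("twice the mean", 2 * mean),
                   ("median", median), ("mode", mode)):
    print(f"Range relative to {name}: {rng / base:.2f}")
print(f"Coefficient of range: {rng / (max(mids) + min(mids)):.2f}")
```

Dividing the same range by different standards gives different relative measures, so the standard used should always be stated alongside the figure.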


Inter-Quartile Range. This is a measure of ‘partial range’.
The inter-quartile range is computed by deducting the value of the
first quartile from the value of the third quartile.


Q.R. = Q3 - Q1

In this measure extreme values have no effect, but it cannot
be called a representative measure because it leaves out 50% of
the frequency distribution, namely 25% before the first quartile
and 25% after the third quartile.
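
As a hedged illustration of the definition above, the Python sketch below finds the quartiles of an individual series and takes their difference. It assumes that the quartiles are located at the (N + 1)/4 th and 3(N + 1)/4 th items of the arranged series, with linear interpolation for fractional positions; the data are the seven values of Illustration 1 and the function names are illustrative.

```python
def item_at(sorted_values, position):
    """Value of the position-th item (counting from 1), interpolating
    between neighbouring items when the position is fractional."""
    if not 1 <= position <= len(sorted_values):
        raise ValueError("position outside the series")
    whole = int(position)
    fraction = position - whole
    lower = sorted_values[whole - 1]
    if fraction == 0:
        return lower
    return lower + fraction * (sorted_values[whole] - lower)


def quartiles(values):
    """Return (Q1, Q3) using the (N + 1)/4 positioning rule."""
    arranged = sorted(values)
    n = len(arranged)
    return item_at(arranged, (n + 1) / 4), item_at(arranged, 3 * (n + 1) / 4)


data = [60, 72, 36, 85, 35, 52, 76]
q1, q3 = quartiles(data)
print("Q1 =", q1, "Q3 =", q3, "Inter-quartile range =", q3 - q1)
```

Replacing the extreme value 85 by any larger figure leaves the result unchanged, which shows why the measure is unaffected by extreme items.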

Percentile Range. It is also a measure of partial range,
but it is mostly used in educational measurements. It is the
difference between the 90th and the 10th percentiles. This measure
excludes the lowest 10% and the highest 10%, taking into account
the two values between which the central 80% of the items occur.
Though it is the same measure as the 1—9 decile range, the
percentile range is more popular because it is expressed in
percentages. This measure has a shortcoming in that it does not
make use of the values of all the items.
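
A minimal Python sketch of the percentile range follows. It assumes the p-th percentile is the item at position p(N + 1)/100 of the arranged series, interpolated when fractional; the nineteen scores are hypothetical and chosen only so that both positions fall on whole items.

```python
def percentile(values, p):
    """p-th percentile, taken as the item at position p * (N + 1) / 100."""
    arranged = sorted(values)
    position = p * (len(arranged) + 1) / 100
    whole = int(position)
    fraction = position - whole
    if whole < 1 or whole > len(arranged) or (whole == len(arranged) and fraction > 0):
        raise ValueError("series too short for this percentile")
    lower = arranged[whole - 1]
    if fraction == 0:
        return lower
    return lower + fraction * (arranged[whole] - lower)


# Hypothetical examination scores (not from the text).
scores = [12, 15, 18, 21, 22, 25, 27, 30, 31, 33,
          35, 38, 40, 42, 45, 47, 50, 55, 60]

p10 = percentile(scores, 10)
p90 = percentile(scores, 90)
print("P10 =", p10, "P90 =", p90, "Percentile range =", p90 - p10)
```

The lowest and highest scores (12 and 60) play no part in the result, which is both the strength and the shortcoming noted above.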


The First Moment of Dispersion or Mean Deviation


Mean or average deviation is the average of the deviations
of the items from the Median, Mean or Mode. All the deviations
in this calculation are taken as positive. In other words, algebraic
signs are not taken into account, because the sum of the deviations
from the arithmetic average is equal to zero, and from the median
it would be nearly zero if the series is moderately asymmetrical.


Symbolically—


(i)   δa = Σdx(a) / N   (from arithmetic average)


(ii)  δm = Σdx(m) / N   (from Median)


(iii) δz = Σdx(z) / N   (from Mode)


Where δ = Mean deviation (read as Delta)
Σdx = Total of the deviations
N = Number of items


Coefficient of Mean deviation = (i) δa / a ; (ii) δm / M ; (iii) δz / Z.
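
The definitions above can be sketched in Python as follows. The nine values are those used in the illustration that follows; the function name `mean_deviation` is illustrative, and the centre is passed in so that the same function serves for the mean, the median or the mode.

```python
from statistics import mean, median


def mean_deviation(values, centre):
    """Average of the deviations from `centre`, all taken as positive."""
    return sum(abs(x - centre) for x in values) / len(values)


values = [45, 47, 49, 52, 54, 57, 57, 71, 72]

a = mean(values)
m = median(values)
delta_a = mean_deviation(values, a)
delta_m = mean_deviation(values, m)

print(f"Mean deviation from mean:   {delta_a:.2f}  (coefficient {delta_a / a:.2f})")
print(f"Mean deviation from median: {delta_m:.2f}  (coefficient {delta_m / m:.2f})")
```

With the signs retained, the deviations from the mean would sum to zero, which is why the absolute value is taken before averaging.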


Illustration—5


Calculate Mean Deviation from Mean and Median of the
following data :—

45, 47, 49, 52, 54, 57, 57, 71, 72

Direct Method
(Series to be arranged for calculating median)

S. No.      m      dx (a)     dx (M)
                    (56)       (54)
  1        45       -11        - 9
  2        47       - 9        - 7
  3        49       - 7        - 5
  4        52       - 4        - 2
  5        54       - 2          0
  6        57       + 1        + 3
  7        57       + 1        + 3
  8        71       +15        +17
  9        72       +16        +18
 N=9      504        66         64

a = Σx / N = 504 / 9 = 56

M = (N + 1) / 2 = (9 + 1) / 2 = 5th item = 54

δ (a) = Σdx(a) / N = 66 / 9 = 7.33

δ (m) = Σdx(m) / N = 64 / 9 = 7.11

Coefficient δa = 7.33 / 56 = .13 approx.
     "      δm = 7.11 / 54 = .13 approx.

Short-cut Method. For δa and δm


             dx from assumed
  m              mean 50
 45               - 5
 47               - 3
 49               - 1
 52               + 2
 54               + 4
 57               + 7
 57               + 7
 71               +21
 72               +22
                  +54

a = x + Σdx / N = 50 + 54 / 9 = 56


δa = (Σdx ± Total Error) / N   (Short-cut Method)


Where Σdx = sum of deviations from assumed average ignoring
+ - signs
Total Error = (No. of items greater than mean -
No. of items less than mean) ×
difference between assumed and actual arithmetic
average.

δa = [72 + (4 - 5) × 6] / 9 = (72 - 6) / 9 = 66 / 9 = 7.33


Short-cut method for δm

δm = (Sum of items > M - Sum of items < M) / N


In the above case

δm = [(57 + 57 + 71 + 72) - (45 + 47 + 49 + 52)] / 9
   = (257 - 193) / 9 = 64 / 9 = 7.11
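
The short-cut methods can be checked against the direct method with a short Python sketch. One assumption is made explicit here: the correction for the assumed mean is computed item by item, since a correction based only on counts of items above and below the mean is exact only when no item lies between the assumed and the actual mean. The function names are illustrative.

```python
def md_from_assumed_mean(values, assumed):
    """Mean deviation from the actual mean, starting from deviations
    taken about an assumed mean and then correcting them."""
    n = len(values)
    actual = assumed + sum(x - assumed for x in values) / n
    low, high = sorted((assumed, actual))
    shift = high - low
    total = sum(abs(x - assumed) for x in values)  # signs ignored

    below = sum(1 for x in values if x <= low)
    above = sum(1 for x in values if x >= high)
    # Items lying strictly between the two means need individual treatment.
    between = sum(low + high - 2 * x for x in values if low < x < high)

    sign = 1 if actual >= assumed else -1
    correction = sign * ((below - above) * shift + between)
    return (total + correction) / n


def md_from_median_shortcut(values):
    """(Sum of items above the median - sum of items below it) / N."""
    arranged = sorted(values)
    half = len(arranged) // 2
    if half == 0:
        return 0.0
    return (sum(arranged[-half:]) - sum(arranged[:half])) / len(arranged)


values = [45, 47, 49, 52, 54, 57, 57, 71, 72]
print(round(md_from_assumed_mean(values, 50), 2))
print(round(md_from_median_shortcut(values), 2))
```

Splitting the arranged series into a lower and an upper half, rather than comparing each item with the median, keeps the median short-cut exact even when several items are equal to the median.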


Discrete Series (Direct Method)
Illustration—6


Compute mean deviation from the following data :—


                                    dx (a)               dx from 10    fdx
Size   Frequency    mf    c.f.    from 10.2   fdx (a)     (m & z)    (m & z)
  4        2         8      2       -6.2       -12.4        -6         -12
  6        1         6      3       -4.2       - 4.2        -4         - 4
  8        3        24      6       -2.2       - 6.6        -2         - 6
 10        6        60     12       -0.2       - 1.2         0           0
 12        4        48     16       +1.8       + 7.2        +2         + 8
 14        3        42     19       +3.8       +11.4        +4         +12
 16        1        16     20       +5.8       + 5.8        +6         + 6
Total     20       204                          48.8                    48
                                           (+ - signs              (+ - signs
                                             ignored)                ignored)

a = Σmf / N = 204 / 20 = 10.2

M = (N + 1) / 2 = (20 + 1) / 2 = 10.5th item = 10

Z = 10

δa = Σfdx(a) / N = 48.8 / 20 = 2.44,   Coefficient = 2.44 / 10.2 = .24

δm = Σfdx(m) / N = 48 / 20 = 2.4,      Coefficient = 2.4 / 10 = .24

δz = Σfdx(z) / N = 48 / 20 = 2.4,      Coefficient = 2.4 / 10 = .24

Short-cut Method

Size    f     dx (from assumed mean 10)    fdx
  4     2            - 6                   - 12
  6     1            - 4                   -  4
  8     3            - 2                   -  6
 10     6              0                      0
 12     4            + 2                   +  8
 14     3            + 4                   + 12
 16     1            + 6                   +  6
Total  20                          Σfdx = + 4 (with signs); 48 (signs ignored)

$a = x + \dfrac{\sum f dx}{N} = 10 + \dfrac{4}{20} = 10.2$

$\delta_a = \dfrac{\sum f dx + \text{total error}}{N}$

Where Σfdx = sum of deviations from the assumed average, ignoring + and - signs.
Total error = (total number of frequencies below the mean - total number of frequencies above the mean) × difference between the actual and the assumed arithmetic average.

$\delta_a = \dfrac{48 + (10.2 - 10)(12 - 8)}{20} = \dfrac{48.8}{20} = 2.44$

δm from short-cut method

$\text{Median} = \dfrac{N+1}{2} = \dfrac{20+1}{2} = 10.5\text{th item} = 10$

$\delta_m = \dfrac{\text{Sum of the items} > M \;-\; \text{Sum of the items} < M}{N}$

$= \dfrac{(10 \times 2 + 12 \times 4 + 14 \times 3 + 16 \times 1) - (4 \times 2 + 6 \times 1 + 8 \times 3 + 10 \times 4)}{20}$

$= \dfrac{(20 + 48 + 42 + 16) - (8 + 6 + 24 + 40)}{20} = \dfrac{126 - 78}{20} = \dfrac{48}{20} = 2.4$

(Note: The median item lies in size 10, the total frequency of which is 6. The median item is the 10.5th and the c.f. up to the third size is 6, so we have to advance 10.5 - 6 = 4.5, or 4 items, into size 10. Hence 4 of these items are less than the median and two are more than the median.)
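The direct method of Illustration 6 can be checked with a short Python sketch. The function name `mean_deviation` and the variable names are illustrative choices, not the book's notation; the data are the sizes and frequencies of the illustration.

```python
from statistics import median

def mean_deviation(sizes, freqs, centre):
    """Mean of the absolute deviations from `centre`, weighted by frequency."""
    n = sum(freqs)
    return sum(f * abs(x - centre) for x, f in zip(sizes, freqs)) / n

sizes = [4, 6, 8, 10, 12, 14, 16]
freqs = [2, 1, 3, 6, 4, 3, 1]

n = sum(freqs)
mean = sum(x * f for x, f in zip(sizes, freqs)) / n           # 204 / 20
# Write the series out item by item so that the median can be located.
items = [x for x, f in zip(sizes, freqs) for _ in range(f)]
med = median(items)

md_mean = mean_deviation(sizes, freqs, mean)     # deviation from the mean
md_median = mean_deviation(sizes, freqs, med)    # deviation from the median

print(round(mean, 2), med, round(md_mean, 2), round(md_median, 2))
```

The deviation from the median comes out smaller than the deviation from the mean, in line with the remark that deviations taken from the median are the minimum.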


Illustration—7

Continuous Series. Compute the Mean deviation from the Mean, Median and Mode from the following distribution of the scores of 50 college students:

Scores      Frequency
140-150         4
150-160         6
160-170        10
170-180        18
180-190         9
190-200         3

Scores     m.v.    f     mf    c.f.   dx (a = 171.2)   f dx(a)   dx (M = 173)   f dx(m)   dx (Z = 174.7)   f dx(z)
140-150    145     4    580     4       - 26.2          104.8      - 28           112       - 29.7          118.8
150-160    155     6    930    10       - 16.2           97.2      - 18           108       - 19.7          118.2
160-170    165    10   1650    20       -  6.2           62.0      -  8            80       -  9.7           97.0
170-180    175    18   3150    38       +  3.8           68.4      +  2            36       +  0.3            5.4
180-190    185     9   1665    47       + 13.8          124.2      + 12           108       + 10.3           92.7
190-200    195     3    585    50       + 23.8           71.4      + 22            66       + 20.3           60.9
Total             50   8560                             528.0                     510                       493.0
                                                  (signs ignored throughout)

$a = \dfrac{\sum mf}{N} = \dfrac{8560}{50} = 171.2$

$m = \dfrac{N+1}{2} = \dfrac{50+1}{2} = 25.5\text{th item}$

$M = l_1 + \dfrac{l_2 - l_1}{f}(m - c) = 170 + \dfrac{180 - 170}{18}(25.5 - 20) = 170 + \dfrac{10}{18}(5.5) = 170 + 3.06 = 173$ approx.

$Z = l_1 + \dfrac{f_1 - f_0}{2f_1 - f_0 - f_2}(l_2 - l_1) = 170 + \dfrac{18 - 10}{36 - 10 - 9}(180 - 170) = 170 + 4.7 = 174.7$

$\delta_a = \dfrac{\sum f\,dx_a}{N} = \dfrac{528}{50} = 10.56$

$\delta_m = \dfrac{\sum f\,dx_m}{N} = \dfrac{510}{50} = 10.2$

$\delta_z = \dfrac{\sum f\,dx_z}{N} = \dfrac{493}{50} = 9.86$


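Short cut Method

Scores     M.V.    f    dx' (from 165)    fdx'    c.f.
140-150    145     4       - 20          -  80      4
150-160    155     6       - 10          -  60     10
160-170    165    10          0              0     20
170-180    175    18       + 10          + 180     38
180-190    185     9       + 20          + 180     47
190-200    195     3       + 30          +  90     50
Total             50            Σfdx' = + 310 (with signs); 590 (signs ignored)

$a = x + \dfrac{\sum f dx'}{N} = 165 + \dfrac{310}{50} = 165 + 6.2 = 171.2$

$\delta_a = \dfrac{\sum f dx' + \text{Total Error}}{N} = \dfrac{590 + (20 - 30)(171.2 - 165)}{50} = \dfrac{590 + (-10 \times 6.2)}{50} = \dfrac{590 - 62}{50} = \dfrac{528}{50} = 10.56$

$\delta_m = \dfrac{\text{Sum of items greater than } M \;-\; \text{Sum of items less than } M}{N}$

$m = \dfrac{N+1}{2} = \dfrac{50+1}{2} = 25.5\text{th item}$

$M = l_1 + \dfrac{l_2 - l_1}{f}(m - c) = 170 + \dfrac{180 - 170}{18}(25.5 - 20) = 170 + \dfrac{10}{18} \times 5.5 = 173.05$

$\delta_m = \dfrac{(176 \times 13 + 185 \times 9 + 195 \times 3) - (145 \times 4 + 155 \times 6 + 165 \times 10 + 173 \times 5)}{50}$ *

$= \dfrac{(2288 + 1665 + 585) - (580 + 930 + 1650 + 865)}{50} = \dfrac{4538 - 4025}{50} = \dfrac{513}{50} = 10.26$

* The median is 173; the class 170-180 (or 170-179) will be split up at 173, and (9 - 3) = 6, 170 + 6 = 176.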


Advantages of Mean Deviation


1—It shows the significance of an average in the distribution.

2—It is a measure of dispersion that is easy to calculate and easy to understand.

3—It is a calculated value.

4—It takes into account all the items of the series; hence it is affected by every value in the distribution.

5—It is less affected by extreme items.

6—It can be computed from any average, but the deviations from the Median are the minimum.

Disadvantages


1—Its computation ignores the algebraic signs.


2—For the purpose of further mathematical treatment the mean deviation is unsatisfactory.


The Second Moment of Dispersion. From the point of view of a mathematician, the practice of ignoring the + and - signs of the deviations when computing the mean deviation is quite unjustifiable, and in consequence the mean deviation is unsuitable for use in further calculations. On the other hand, if the signs are taken into account, the sum of the deviations from the arithmetic mean will be equal to zero. This problem can be solved by squaring the deviations. Hence an average of the squares of the deviations of the individual items is calculated, which is called the Second Moment of Dispersion. Symbolically,

$\text{2nd Moment of Dispersion} = \dfrac{\sum (dx^2)}{N}$

But the utility of this measure of dispersion is limited. It is simply a step towards the calculation of the Standard Deviation, which has great utility in statistics.


Standard Deviation. This is by far the most important of the dispersion measures. The mean deviation and the Second Moment of Dispersion are nowadays only of academic interest and in practice have been replaced by the Standard Deviation, which enters into so many of the advanced formulae. It is also called the root-mean-square deviation. The standard deviation is denoted by the Greek letter sigma, σ. To state briefly, the standard deviation is the square root of the mean of the squared deviations from the arithmetic mean. The standard deviation is always calculated from the arithmetic mean. The formula for calculating the standard deviation is

$\sigma = \sqrt{\dfrac{\sum dx^2}{N}}$ or $\sqrt{\dfrac{\sum (x - \bar{x})^2}{N}}$

Where σ = Standard deviation.
dx² = Square of the deviation from the arithmetic mean. (It may also be written $(x - \bar{x})^2$.)
N = Number of items.

There are two methods of calculating the standard deviation: (i) Direct Method and (ii) Short-cut Method.

$\text{Coefficient of S.D.} = \dfrac{\sigma}{a}$


Series of Individual observation
Direct Method
Illustration—8
Calculate the standard deviation of a wage earner's monthly earnings given below.
Month      1   2   3   4   5   6   7   8   9  10  11  12
Earnings  39  40  40  41  41  42  42  43  43  44  44  45

S. No.    Earnings (x)    dx (from a = 42)    dx²
  1           39               - 3             9
  2           40               - 2             4
  3           40               - 2             4
  4           41               - 1             1
  5           41               - 1             1
  6           42                 0             0
  7           42                 0             0
  8           43               + 1             1
  9           43               + 1             1
 10           44               + 2             4
 11           44               + 2             4
 12           45               + 3             9
N = 12       504                              38

$a = \dfrac{\sum x}{N} = \dfrac{504}{12} = 42$

$\sigma = \sqrt{\dfrac{\sum dx^2}{N}} = \sqrt{\dfrac{38}{12}} = \sqrt{3.17} = \text{Rs. } 1.78$ (approx.)

$\text{Coefficient of } \sigma = \dfrac{\sigma}{a} = \dfrac{1.78}{42} = .042$
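As a check on the direct method, the following minimal Python sketch recomputes Illustration 8. The helper name `std_dev` is an illustrative choice; it divides by N, as the formula in the text does.

```python
from math import sqrt

def std_dev(items):
    """Square root of the mean of squared deviations from the arithmetic mean."""
    n = len(items)
    a = sum(items) / n                       # arithmetic mean
    return sqrt(sum((x - a) ** 2 for x in items) / n)

earnings = [39, 40, 40, 41, 41, 42, 42, 43, 43, 44, 44, 45]

a = sum(earnings) / len(earnings)
sigma = std_dev(earnings)
coefficient = sigma / a                      # coefficient of standard deviation

print(a, round(sigma, 2), round(coefficient, 3))
```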


Short-cut Method. The short-cut method avoids the labour of taking deviations from the actual arithmetic mean. It consists in taking a convenient provisional mean or working mean and employing the formula after adjustment. The formula for computing the standard deviation by the short-cut method is

(1) $\sigma = \sqrt{\dfrac{\sum dx^2}{N} - (a - x)^2}$

or

(2) $\sigma = \sqrt{\dfrac{\sum dx^2}{N} - \left(\dfrac{\sum dx}{N}\right)^2}$

or

(3) $\sigma = \sqrt{\dfrac{\sum dx^2 - N(a - x)^2}{N}}$

There is another short-cut formula where deviations are not calculated. It is

(4) $\sigma = \sqrt{\dfrac{\sum x^2}{N} - (a)^2}$

Where:
Σdx² = sum of the squares of the deviations from the assumed mean.
N = number of items.
a = actual arithmetic average.
x = assumed arithmetic average.
Σdx = sum of deviations from the assumed mean.
x² = square of the items (in formula 4, Σx² is the sum of the squares of the items).


Illustration—9

Calculate the standard deviation by the short-cut method of a wage earner's monthly earnings:
Month      1   2   3   4   5   6   7   8   9  10  11  12
Earnings  39  40  40  41  41  42  42  43  43  44  44  45

S. No.    Earnings    dx (from assumed mean 40)    dx²     x²
  1          39              - 1                    1     1521
  2          40                0                    0     1600
  3          40                0                    0     1600
  4          41              + 1                    1     1681
  5          41              + 1                    1     1681
  6          42              + 2                    4     1764
  7          42              + 2                    4     1764
  8          43              + 3                    9     1849
  9          43              + 3                    9     1849
 10          44              + 4                   16     1936
 11          44              + 4                   16     1936
 12          45              + 5                   25     2025
N = 12      504              +24                   86    21206

$a = x + \dfrac{\sum dx}{N} = 40 + \dfrac{24}{12} = 42$

(1) $\sigma = \sqrt{\dfrac{\sum dx^2}{N} - (a - x)^2} = \sqrt{\dfrac{86}{12} - (42 - 40)^2} = \sqrt{7.17 - 4} = \sqrt{3.17} = 1.78$

(2) $\sigma = \sqrt{\dfrac{\sum dx^2}{N} - \left(\dfrac{\sum dx}{N}\right)^2} = \sqrt{\dfrac{86}{12} - \left(\dfrac{24}{12}\right)^2} = \sqrt{7.17 - 4} = 1.78$

(3) $\sigma = \sqrt{\dfrac{\sum dx^2 - N(a - x)^2}{N}} = \sqrt{\dfrac{86 - 12(42 - 40)^2}{12}} = \sqrt{\dfrac{38}{12}} = 1.78$

(4) $\sigma = \sqrt{\dfrac{\sum x^2}{N} - (a)^2} = \sqrt{\dfrac{21206}{12} - (42)^2} = \sqrt{1767.17 - 1764} = \sqrt{3.17} = 1.78$
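The four short-cut formulae are algebraically the same quantity, which a small Python sketch can confirm on the earnings data. The function name `sd_shortcuts` is hypothetical; the assumed mean 40 is the working mean used in the illustration.

```python
from math import sqrt, isclose

def sd_shortcuts(items, assumed):
    """Return sigma from the four short-cut formulae, in the order (1) to (4)."""
    n = len(items)
    a = sum(items) / n                        # actual arithmetic average
    dx = [v - assumed for v in items]         # deviations from the assumed mean
    sum_dx = sum(dx)
    sum_dx2 = sum(d * d for d in dx)
    sum_x2 = sum(v * v for v in items)
    f1 = sqrt(sum_dx2 / n - (a - assumed) ** 2)
    f2 = sqrt(sum_dx2 / n - (sum_dx / n) ** 2)
    f3 = sqrt((sum_dx2 - n * (a - assumed) ** 2) / n)
    f4 = sqrt(sum_x2 / n - a ** 2)
    return f1, f2, f3, f4

earnings = [39, 40, 40, 41, 41, 42, 42, 43, 43, 44, 44, 45]
results = sd_shortcuts(earnings, assumed=40)
print([round(r, 2) for r in results])
```

Any other working mean gives the same four values, since the adjustment term removes the effect of the choice.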
Discrete Series 


Direct Method 
The formula for calculating σ in discrete and continuous series is:

By Direct Method

$\sigma = \sqrt{\dfrac{\sum f dx^2}{N}}$

By Short-cut Method

(1) $\sigma = \sqrt{\dfrac{\sum f dx^2}{N} - (a - x)^2}$

(2) $\sigma = \sqrt{\dfrac{\sum f dx^2}{N} - \left(\dfrac{\sum f dx}{N}\right)^2}$

(3) $\sigma = \sqrt{\dfrac{\sum f dx^2 - N(a - x)^2}{N}}$

(4) $\sigma = \sqrt{\dfrac{\sum f x^2}{N} - (a)^2}$

Illustration—10


Calculate the standard deviation from the following data:

 x     Frequency
 2         1
 4         2
 6         3
 8         5
10         3
12         2
14         1

Direct Method

  x      f     mf     dx (from a = 8)    fdx     fdx²
 (1)    (2)    (3)         (4)           (5)    (4 × 5)
  2      1      2         - 6            - 6      36
  4      2      8         - 4            - 8      32
  6      3     18         - 2            - 6      12
  8      5     40           0              0       0
 10      3     30         + 2            + 6      12
 12      2     24         + 4            + 8      32
 14      1     14         + 6            + 6      36
Total   17    136                                160

$a = \dfrac{\sum mf}{N} = \dfrac{136}{17} = 8$

$\sigma = \sqrt{\dfrac{\sum f dx^2}{N}} = \sqrt{\dfrac{160}{17}} = \sqrt{9.41} = 3.07$


Short-cut Method

  x      f     dx (from 6)    fdx     fdx²     fx     fx²
  2      1       - 4          - 4      16       2       4
  4      2       - 2          - 4       8       8      32
  6      3         0            0       0      18     108
  8      5       + 2          +10      20      40     320
 10      3       + 4          +12      48      30     300
 12      2       + 6          +12      72      24     288
 14      1       + 8          + 8      64      14     196
Total   17                    +34     228     136    1248

$a = x + \dfrac{\sum f dx}{N} = 6 + \dfrac{34}{17} = 8$

(1) $\sigma = \sqrt{\dfrac{\sum f dx^2}{N} - (a - x)^2} = \sqrt{\dfrac{228}{17} - (8 - 6)^2} = \sqrt{13.41 - 4} = \sqrt{9.41} = 3.07$

(2) $\sigma = \sqrt{\dfrac{\sum f dx^2}{N} - \left(\dfrac{\sum f dx}{N}\right)^2} = \sqrt{\dfrac{228}{17} - \left(\dfrac{34}{17}\right)^2} = \sqrt{13.41 - 4} = \sqrt{9.41} = 3.07$

(3) $\sigma = \sqrt{\dfrac{\sum f dx^2 - N(a - x)^2}{N}} = \sqrt{\dfrac{228 - 17(8 - 6)^2}{17}} = \sqrt{\dfrac{160}{17}} = 3.07$

(4) $\sigma = \sqrt{\dfrac{\sum f x^2}{N} - (a)^2} = \sqrt{\dfrac{1248}{17} - (8)^2} = \sqrt{73.41 - 64} = \sqrt{9.41} = 3.07$

Note: All the formulae for σ by the short-cut method are originally the same. They are based on different calculation-saving devices.


Continuous Series
Direct Method

Illustration—11

Calculate the standard deviation from the following data by direct and short-cut methods:

Marks above    No. of students    Marks above    No. of students
     0              150                50              70
    10              140                60              30
    20              100                70              14
    30               80
    40               80                          (M. A. Alld.)
(The series has to be arranged in class intervals)

Direct Method

Marks    m.v.     f      mf     dx (from a = 39.26)     fdx         fdx²
 0-10      5     10      50        - 34.26            - 342.60    11737.4760
10-20     15     40     600        - 24.26            - 970.40    23541.9040
20-30     25     20     500        - 14.26            - 285.20     4066.9520
30-40     35      0       0        -  4.26                0            0
40-50     45     10     450        +  5.74            +  57.40      329.4760
50-60     55     40    2200        + 15.74            + 629.60     9909.9040
60-70     65     16    1040        + 25.74            + 411.84    10600.7616
70-80     75     14    1050        + 35.74            + 500.36    17882.8664
Total           150    5890                                       78069.34

$a = \dfrac{\sum mf}{N} = \dfrac{5890}{150} = 39.26$

$\sigma = \sqrt{\dfrac{\sum f dx^2}{N}} = \sqrt{\dfrac{78069.34}{150}} = \sqrt{520.46} = 22.8$

Short-cut Method

Marks    m.v. (x)     f     dx (from 35)     fdx      fdx²      x²       fx²
 0-10       5        10       - 30          - 300     9000       25       250
10-20      15        40       - 20          - 800    16000      225      9000
20-30      25        20       - 10          - 200     2000      625     12500
30-40      35         0          0              0        0     1225         0
40-50      45        10       + 10          + 100     1000     2025     20250
50-60      55        40       + 20          + 800    16000     3025    121000
60-70      65        16       + 30          + 480    14400     4225     67600
70-80      75        14       + 40          + 560    22400     5625     78750
Total               150                     + 640    80800             309350

$a = x + \dfrac{\sum f dx}{N} = 35 + \dfrac{640}{150} = 35 + 4.26 = 39.26$

(1) $\sigma = \sqrt{\dfrac{\sum f dx^2}{N} - (a - x)^2} = \sqrt{\dfrac{80800}{150} - (39.26 - 35)^2} = \sqrt{538.67 - 18.15} = \sqrt{520.52} = 22.8$

(2) $\sigma = \sqrt{\dfrac{\sum f dx^2}{N} - \left(\dfrac{\sum f dx}{N}\right)^2} = \sqrt{\dfrac{80800}{150} - \left(\dfrac{640}{150}\right)^2} = \sqrt{538.67 - 18.20} = \sqrt{520.47} = 22.8$

(3) $\sigma = \sqrt{\dfrac{\sum f dx^2 - N(a - x)^2}{N}} = \sqrt{\dfrac{80800 - 150(39.26 - 35)^2}{150}} = \sqrt{520.52} = 22.8$

(4) $\sigma = \sqrt{\dfrac{\sum f x^2}{N} - (a)^2} = \sqrt{\dfrac{309350}{150} - (39.26)^2} = \sqrt{2062.33 - 1541.35} = \sqrt{520.98} = 22.8$


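Step Deviation. In order to facilitate calculations, a common factor may be taken out of the deviations. In the above example the deviations from the assumed average 35 have the common factor 10. In this case the formula will be

$\sigma = \sqrt{\dfrac{\sum f (dx/i)^2}{N} - \left(\dfrac{\sum f (dx/i)}{N}\right)^2} \times i$

Where i stands for the common factor, 10 in the above case, and dx/i is the deviation divided by that factor.

$\sigma = \sqrt{\dfrac{808}{150} - \left(\dfrac{64}{150}\right)^2} \times 10 = \sqrt{5.39 - .18} \times 10 = \sqrt{5.21} \times 10 = 2.28 \times 10 = 22.8$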
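The step-deviation working for a continuous series can be sketched in Python as follows. The function name `sd_step_deviation` is an illustrative choice; the mid values and frequencies are those of Illustration 11, with the assumed average 35 and common factor 10.

```python
from math import sqrt

def sd_step_deviation(mid_values, freqs, assumed, i):
    """Mean and sigma of a grouped series using deviations in units of i."""
    n = sum(freqs)
    steps = [(m - assumed) / i for m in mid_values]         # dx / i
    sum_fd = sum(f * d for f, d in zip(freqs, steps))       # sum of f(dx/i)
    sum_fd2 = sum(f * d * d for f, d in zip(freqs, steps))  # sum of f(dx/i)^2
    mean = assumed + sum_fd / n * i
    sigma = sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * i
    return mean, sigma, sum_fd, sum_fd2

mid_values = [5, 15, 25, 35, 45, 55, 65, 75]
freqs = [10, 40, 20, 0, 10, 40, 16, 14]

mean, sigma, sum_fd, sum_fd2 = sd_step_deviation(mid_values, freqs, assumed=35, i=10)
print(sum_fd, sum_fd2, round(mean, 2), round(sigma, 1))
```

Working in step units keeps the intermediate totals small (64 and 808 here); the final multiplication by i restores the original scale.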


Charlier's check of accuracy. Charlier has given a formula to check the accuracy of the calculation of σ. It is

$\sum f(dx + 1)^2 - \sum f dx^2 - 2\sum f dx = N$

In the illustration given above (using the deviations in units of 10) the calculations can be checked:

 dx + 1      (dx + 1)    (dx + 1)²    f(dx + 1)²
- 3 + 1        - 2           4            40
- 2 + 1        - 1           1            40
- 1 + 1          0           0             0
  0 + 1        + 1           1             0
+ 1 + 1        + 2           4            40
+ 2 + 1        + 3           9           360
+ 3 + 1        + 4          16           256
+ 4 + 1        + 5          25           350
Total                                   1086

1086 - 808 - 128 = 150
1086 - 936 = 150
Hence the calculations are correct.
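Charlier's check is the identity Σf(dx + 1)² = Σfdx² + 2Σfdx + N, so it can be tested mechanically. The sketch below is a minimal illustration with a hypothetical function name, applied to the frequencies and step deviations of the illustration.

```python
def charlier_check(freqs, devs):
    """Return both sides of Charlier's identity for the given deviations."""
    n = sum(freqs)
    lhs = sum(f * (d + 1) ** 2 for f, d in zip(freqs, devs))
    sum_fd = sum(f * d for f, d in zip(freqs, devs))
    sum_fd2 = sum(f * d * d for f, d in zip(freqs, devs))
    rhs = sum_fd2 + 2 * sum_fd + n
    return lhs, rhs

freqs = [10, 40, 20, 0, 10, 40, 16, 14]
devs = [-3, -2, -1, 0, 1, 2, 3, 4]          # deviations in units of 10

lhs, rhs = charlier_check(freqs, devs)
print(lhs, rhs, lhs == rhs)
```

If the two sides differ, at least one of the totals Σfdx or Σfdx² has been added up wrongly.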


Sheppard's correction for grouping. When calculating the mean and other measures in a continuous form of distribution, we assumed the mid-values to be representative of the values of the classes. But this is not a correct assumption, although in the case of the mean the errors present tend to offset each other because of the plus and minus signs. The standard deviation so calculated will, however, be greater than what it actually should be. W. F. Sheppard has given a formula for adjusting it, called Sheppard's correction. It is

$\text{Corrected } \sigma = \sqrt{\sigma^2 \text{ (uncorrected)} - \tfrac{1}{12}(i)^2}$

where i represents the class interval.

In the illustration the corrected σ will be

$\sigma \text{ (corrected)} = \sqrt{22.8^2 - \tfrac{1}{12}(10)^2} = \sqrt{519.84 - 8.33} = \sqrt{511.51} = 22.62$
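Sheppard's correction subtracts one-twelfth of the squared class interval from the uncorrected variance. A minimal Python sketch, with a hypothetical function name, evaluated for σ = 22.8 and i = 10:

```python
from math import sqrt

def sheppard_corrected_sd(sigma, i):
    """Standard deviation after subtracting i^2 / 12 from the variance."""
    return sqrt(sigma ** 2 - i ** 2 / 12)

corrected = sheppard_corrected_sd(22.8, 10)
print(round(corrected, 2))   # 519.84 - 8.33 = 511.51, whose root is about 22.62
```

The adjustment is small here because the class interval is narrow in relation to σ.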


Other Measures from the Standard Deviation
(A) Variance: The square of the standard deviation is called the variance. Symbolically,

$\text{Variance} = \sigma^2$

Variance plays a major part in statistical inference.

(B) Modulus: The square root of twice the sum of the squares of the deviations divided by the number of items is known as the modulus. It is represented by the letter 'C'. Symbolically

$C = \sqrt{\dfrac{2\sum (d)^2}{N}}$

(C) Coefficient of variation: The coefficient of standard deviation multiplied by 100 gives the coefficient of variation. Symbolically

$\text{C.V.} = \dfrac{\sigma}{a} \times 100$

This measure is used in comparing the dispersion of two series.

(D) Precision: It is the reciprocal of the modulus.

$\text{Precision} = \dfrac{1}{C} = \dfrac{1}{\sqrt{\dfrac{2\sum (d)^2}{N}}}$

(E) Probable Error: $.6745 \times \sigma$
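All five measures follow from σ and the mean, since 2Σd²/N = 2σ² and the modulus is therefore σ√2. The Python sketch below collects them in one place; the function name and dictionary keys are illustrative, and the inputs (σ = 17.5, mean = 26) are sample figures chosen only to exercise the formulae.

```python
from math import sqrt, isclose

def derived_measures(sigma, mean):
    """Variance, modulus, coefficient of variation, precision, probable error."""
    modulus = sigma * sqrt(2)                 # sqrt(2 * sum(d^2) / N)
    return {
        "variance": sigma ** 2,
        "modulus": modulus,
        "coefficient_of_variation": sigma / mean * 100,
        "precision": 1 / modulus,
        "probable_error": 0.6745 * sigma,
    }

measures = derived_measures(sigma=17.5, mean=26)
for name, value in measures.items():
    print(name, round(value, 4))
```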
"Typical Illustrations 


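Illustration—12


Three distributions each of 100 members and standard deviation 4.5 units are located with their arithmetic means at 12.1, 17.1 and 22.1 units respectively. Find the standard deviation of the distribution obtained by combining the three.

$\text{Combined mean} = \dfrac{N_1 a_1 + N_2 a_2 + N_3 a_3}{N_1 + N_2 + N_3} = \dfrac{(100 \times 12.1) + (100 \times 17.1) + (100 \times 22.1)}{100 + 100 + 100} = \dfrac{1210 + 1710 + 2210}{300} = \dfrac{5130}{300} = 17.1 \text{ units}$

$\text{Combined } \sigma = \sqrt{\dfrac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 + N_1 d_1^2 + N_2 d_2^2 + N_3 d_3^2}{N_1 + N_2 + N_3}}$

$d_1 = (12.1 - 17.1) = -5$

$d_2 = (17.1 - 17.1) = 0$

$d_3 = (22.1 - 17.1) = +5$

$= \sqrt{\dfrac{(100 \times 4.5^2) + (100 \times 4.5^2) + (100 \times 4.5^2) + (100 \times (-5)^2) + (100 \times 0^2) + (100 \times 5^2)}{100 + 100 + 100}}$

$= \sqrt{\dfrac{6075 + 5000}{300}} = \sqrt{\dfrac{11075}{300}} = \sqrt{36.92} = 6.1 \text{ units}$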
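The combined mean and combined standard deviation can be computed for any number of groups with a short Python sketch. The function name `combine_groups` and the tuple layout (size, mean, standard deviation) are assumptions made for this example.

```python
from math import sqrt, isclose

def combine_groups(groups):
    """groups: list of (n, mean, sd). Returns the combined mean and sd."""
    total = sum(n for n, _, _ in groups)
    mean = sum(n * a for n, a, _ in groups) / total
    # Each group contributes its own variance plus the squared distance
    # of its mean from the combined mean, weighted by its size.
    variance = sum(n * (s ** 2 + (a - mean) ** 2) for n, a, s in groups) / total
    return mean, sqrt(variance)

groups = [(100, 12.1, 4.5), (100, 17.1, 4.5), (100, 22.1, 4.5)]
combined_mean, combined_sd = combine_groups(groups)
print(round(combined_mean, 1), round(combined_sd, 1))
```

The combined σ exceeds each group's own 4.5 because the three means are spread apart.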


Illustration—13


In a cricket match Mr. Verma has an average of 26.0 runs with a standard deviation of 17.5, whereas Mr. Wahi has an average of 33.85 runs with a standard deviation of 15.05. In your opinion who is the more competent player?

Mr. Verma: $\text{C.V.} = \dfrac{\sigma}{a} \times 100 = \dfrac{17.5}{26} \times 100 = 67.31\%$

Mr. Wahi: $\text{C.V.} = \dfrac{\sigma}{a} \times 100 = \dfrac{15.05}{33.85} \times 100 = 44.46\%$

The coefficient of variation for Mr. Verma's game is 67.31% while that for Mr. Wahi's game is 44.46%. Mr. Wahi has both the higher average and the smaller relative variation. This shows that Mr. Wahi is the more competent player.


Illustration—14
The scores of cricketers A and B for 20 innings each are tabulated below. Ascertain which may be regarded as the more consistent player. (M. Com., B.H.U.)

Score    50   51   52   53   54   55   56   57
A         1    0    0    4    3    6    3    3
B         1    2    2    6    3    4    2    0

            dx (assumed     --------- A ---------     --------- B ---------
Score       average 53)      f      fdx     fdx²       f      fdx     fdx²
 50            - 3           1      - 3       9        1      - 3       9
 51            - 2           0        0       0        2      - 4       8
 52            - 1           0        0       0        2      - 2       2
 53              0           4        0       0        6        0       0
 54            + 1           3      + 3       3        3      + 3       3
 55            + 2           6      +12      24        4      + 8      16
 56            + 3           3      + 9      27        2      + 6      18
 57            + 4           3      +12      48        0        0       0
Total                       20      +33     111       20      + 8      56

A: $a = x + \dfrac{\sum f dx}{N} = 53 + \dfrac{33}{20} = 54.65$

B: $a = x + \dfrac{\sum f dx}{N} = 53 + \dfrac{8}{20} = 53 + 0.4 = 53.4$

A: $\sigma = \sqrt{\dfrac{\sum f dx^2}{N} - \left(\dfrac{\sum f dx}{N}\right)^2} = \sqrt{\dfrac{111}{20} - \left(\dfrac{33}{20}\right)^2} = \sqrt{5.55 - 2.72} = \sqrt{2.83} = 1.68 \text{ runs}$

B: $\sigma = \sqrt{\dfrac{56}{20} - \left(\dfrac{8}{20}\right)^2} = \sqrt{2.8 - .16} = \sqrt{2.64} = 1.62 \text{ runs}$

A: $\text{C.V.} = \dfrac{\sigma}{a} \times 100 = \dfrac{1.68}{54.65} \times 100 = 3.07\%$

B: $\text{C.V.} = \dfrac{\sigma}{a} \times 100 = \dfrac{1.62}{53.4} \times 100 = 3.03\%$

This shows that B, having the smaller coefficient of variation, is the more consistent player.


Illustration—15


In any two samples where the variates x, and x, are 
measured in same units, 


—36 (Summation) xx,?—49428 


х.—49 ( 3 ) 5х:2—71258 E 
compute the values of the Standard Deviation of the two 
samples. Ф 


What additional information is required to calculate the | 
coefficient of variation of the above two samples ? 


(B. Com., Luck) 
зах? 49458 4 
got the 1st sample— 4 қ = \ =_= 
За 71258 
f the 2nd sample— 42% _ .—88.08 
g 0: e ani р N 29 


с 
$\text{Coefficient of variation} = \dfrac{\sigma}{a} \times 100$

Hence the arithmetic averages of both the samples are additionally required to calculate the coefficient of variation.


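Illustration—16


In any two series where dx₁ and dx₂ represent the deviations from an assumed average 100:

N₁ = 150; Σdx₁ = 180; Σdx₁² = 245320
N₂ = 200; Σdx₂ = 250; Σdx₂² = 43850
(assumed average) A = 100

Calculate the coefficient of variation for the two series.

Series I:

$a_1 = x + \dfrac{\sum dx_1}{N_1} = 100 + \dfrac{180}{150} = 101.2$

$\sigma_1 = \sqrt{\dfrac{\sum dx_1^2}{N_1} - \left(\dfrac{\sum dx_1}{N_1}\right)^2} = \sqrt{\dfrac{245320}{150} - \left(\dfrac{180}{150}\right)^2} = \sqrt{1635.47 - 1.44} = 40.4$

$\text{C.V.}_1 = \dfrac{\sigma_1}{a_1} \times 100 = \dfrac{40.4}{101.2} \times 100 = 39.9$

Series II:

$a_2 = x + \dfrac{\sum dx_2}{N_2} = 100 + \dfrac{250}{200} = 101.25$

$\sigma_2 = \sqrt{\dfrac{\sum dx_2^2}{N_2} - \left(\dfrac{\sum dx_2}{N_2}\right)^2} = \sqrt{\dfrac{43850}{200} - \left(\dfrac{250}{200}\right)^2} = \sqrt{219.25 - 1.56} = 14.75$

$\text{C.V.}_2 = \dfrac{\sigma_2}{a_2} \times 100 = \dfrac{14.75}{101.25} \times 100 = 14.6$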


Illustration—17


A collar manufacturer is considering the production of a new style of collar to attract young men. The following statistics of neck circumference are available, based upon measurements of a typical group of college students:

Mid value (inches)    No. of students
      12.5                   4
      13.0                  19
      13.5                  30
      14.0                  63
      14.5                  66
      15.0                  29
      15.5                  18
      16.0                   1
      16.5                   1

Compute the Standard deviation and use the criterion (M ± 3σ), where σ is the standard deviation and M is the arithmetic average, to determine the largest and the smallest sizes of collars he should make in order to meet the needs of practically all his customers.

Mid value     f      dx (from 14.5)     fdx       fdx²
  12.5        4         - 2.0          -  8.0     16.00
  13.0       19         - 1.5          - 28.5     42.75
  13.5       30         - 1.0          - 30.0     30.00
  14.0       63         - 0.5          - 31.5     15.75
  14.5       66           0               0         0
  15.0       29         + 0.5          + 14.5      7.25
  15.5       18         + 1.0          + 18.0     18.00
  16.0        1         + 1.5          +  1.5      2.25
  16.5        1         + 2.0          +  2.0      4.00
Total       231                        - 62.0    136.00

$a = x + \dfrac{\sum f dx}{N} = 14.5 + \dfrac{-62}{231} = 14.5 - .27 = 14.23$

$\sigma = \sqrt{\dfrac{\sum f dx^2}{N} - \left(\dfrac{\sum f dx}{N}\right)^2} = \sqrt{\dfrac{136}{231} - \left(\dfrac{62}{231}\right)^2} = \sqrt{.59 - .07} = \sqrt{.52} = .72 \text{ inches}$

Using the criterion M ± 3σ, together with the allowance of 3/4 inch that the solution adds to the neck measurement:

= 14.23 ± 3 × .72 + 3/4 inches
= 14.23 ± 2.16 + .75 inches

The largest size of collar = 14.23 + 2.16 + .75 = 17.14"
The smallest size of collar = 14.23 - 2.16 + .75 = 12.82"


Illustration—18

The values of the arithmetic mean and standard deviation of the following frequency distribution of a continuous variable, derived by using both an arbitrary origin and scale, are 135.3 and 9.6 units respectively. Determine the actual class intervals.

  f      dx/i     fdx     fdx²
  2      - 4      -  8     32
  5      - 3      - 15     45
  8      - 2      - 16     32
 18      - 1      - 18     18
 22        0         0      0
 13      + 1      + 13     13
  8      + 2      + 16     32
  4      + 3      + 12     36
 80               - 16    208

$a = x + \dfrac{\sum f dx}{N} \times i = 135.3$

$x - \dfrac{16 \times i}{80} = 135.3$

$\sigma = \sqrt{\dfrac{\sum f dx^2}{N} - \left(\dfrac{\sum f dx}{N}\right)^2} \times i$

$9.6 = \sqrt{\dfrac{208}{80} - \left(\dfrac{16}{80}\right)^2} \times i = \sqrt{2.6 - .04} \times i = 1.6\,i$

which give i = 6 and x = 136.5.

The deviations will be -24, -18, -12, -6 and so on from the assumed average (x) 136.5.

The mid values will be 112.5, 118.5, 124.5 and so on.

Since the first mid value is 112.5 and the class interval is 6, the classes will be 109.5-115.5, 115.5-121.5 and so on.


Illustration—19


Compile a table showing the frequencies with which words of different numbers of letters occur in the extract reproduced below (omitting punctuation marks), treating as the variable the number of letters in each word, and obtain the mean, median and the coefficient of variation of the distribution:

"Success in the examination confers no absolute right to appointment, unless Government is satisfied, after such enquiry as may be considered necessary, that candidate is suitable in all respects for appointment to the public service."

Number of letters    Frequency    c.f.    dx (from 6)    fdx     fdx²
       2                 9          9       - 4          - 36     144
       3                 6         15       - 3          - 18      54
       4                 2         17       - 2          -  4       8
       5                 2         19       - 1          -  2       2
       6                 2         21         0             0       0
       7                 4         25       + 1          +  4       4
       8                 3         28       + 2          +  6      12
       9                 3         31       + 3          +  9      27
      10                 2         33       + 4          +  8      32
      11                 3         36       + 5          + 15      75
Total                36 words                            - 18     358

$M = \dfrac{N+1}{2}\text{th item} = \dfrac{36+1}{2} = 18.5\text{th item} = 5 \text{ letters}$

$a = x + \dfrac{\sum f dx}{N} = 6 + \dfrac{-18}{36} = 5.5 \text{ letters}$

$\sigma = \sqrt{\dfrac{\sum f dx^2}{N} - \left(\dfrac{\sum f dx}{N}\right)^2} = \sqrt{\dfrac{358}{36} - \left(\dfrac{18}{36}\right)^2} = \sqrt{9.94 - 0.25} = \sqrt{9.69} = 3.11$

$\text{C.V.} = \dfrac{\sigma}{a} \times 100 = \dfrac{3.11}{5.5} \times 100 = 56.6\%$


Illustration—20


In two factories A and B engaged in the same industry in an area, the average weekly wages in rupees and the standard deviations are as follows:

Factory    Average weekly wage    S.D.    No. of wage earners
   A              34.5             5             476
   B              28.5             4.5           524

(a) Which factory, A or B, pays out the larger amount as weekly wages?

(b) What is the average weekly wage of all workers in the two factories taken together?

(c) What is the coefficient of variation in the case of each factory separately? What inference do you draw from a comparison of these two figures?

(M. A., B. H.U.)

(a) Total wages paid by Factory A
Average wage × No. of workers = Total wage bill
= 34.5 × 476 = Rs 16,422

Total wages paid by Factory B
Average wage × No. of workers = Total wage bill
= 28.5 × 524 = Rs 14,934

Factory A pays more wages.

(b) $\text{Combined Mean} = \dfrac{N_1 a_1 + N_2 a_2}{N_1 + N_2} = \dfrac{34.5 \times 476 + 28.5 \times 524}{476 + 524} = \dfrac{31356}{1000} = \text{Rs } 31.36$

(c) $\text{C.V. of Factory A} = \dfrac{\sigma}{a} \times 100 = \dfrac{5}{34.5} \times 100 = 14.5\%$

$\text{C.V. of Factory B} = \dfrac{\sigma}{a} \times 100 = \dfrac{4.5}{28.5} \times 100 = 15.8\%$

There is more stability of wages in Factory A, since its coefficient of variation is the smaller.
Illustration—21


Calculate the mean and the standard deviation of the following figures and state the percentage of cases which lie outside the mean at distances a ± σ, a ± 2σ, a ± 3σ, where 'a' stands for the arithmetic average and σ for the standard deviation.

115, 117, 121, 125, 116, 120, 118, 117, 119, 116,
122, 124, 123, 118, 120, 118, 126, 127, 122, 123

  x     dx (from 120)    dx²          x     dx (from 120)    dx²
 115       - 5           25          124       + 4           16
 117       - 3            9          123       + 3            9
 121       + 1            1          118       - 2            4
 125       + 5           25          120         0            0
 116       - 4           16          118       - 2            4
 120         0            0          126       + 6           36
 118       - 2            4          127       + 7           49
 117       - 3            9          122       + 2            4
 119       - 1            1          123       + 3            9
 116       - 4           16
 122       + 2            4          N = 20    + 7          241

$a = x + \dfrac{\sum dx}{N} = 120 + \dfrac{7}{20} = 120.35$

$\sigma = \sqrt{\dfrac{\sum dx^2}{N} - \left(\dfrac{\sum dx}{N}\right)^2} = \sqrt{\dfrac{241}{20} - \left(\dfrac{7}{20}\right)^2} = \sqrt{12.05 - .1225} = \sqrt{11.93} = 3.45 \text{ approx.}$

a ± σ = 120.35 ± 3.45
= 123.80 and 116.90

Items lying outside these limits are 115, 116, 116, 124, 125, 126 and 127, i.e. 7/20 × 100 = 35%

a ± 2σ = 120.35 ± 2 × 3.45
= 127.25 and 113.45

No item lies outside these limits.

a ± 3σ = 120.35 ± 3 × 3.45
= 130.70 and 110.00

No item lies outside these limits.
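The counting in this illustration is easy to get wrong by hand, so a direct count is a useful check. The Python sketch below (function name hypothetical) computes the mean and σ of the twenty figures and counts the items falling outside a ± kσ; note that the two values of 116 lie just below the lower one-sigma limit of about 116.9.

```python
from math import sqrt

def percent_outside(items, k):
    """Percentage of items lying outside mean +/- k * sigma."""
    n = len(items)
    a = sum(items) / n
    sigma = sqrt(sum((x - a) ** 2 for x in items) / n)
    low, high = a - k * sigma, a + k * sigma
    outside = [x for x in items if x < low or x > high]
    return 100 * len(outside) / n

figures = [115, 117, 121, 125, 116, 120, 118, 117, 119, 116,
           122, 124, 123, 118, 120, 118, 126, 127, 122, 123]

outside_1 = percent_outside(figures, 1)
outside_2 = percent_outside(figures, 2)
outside_3 = percent_outside(figures, 3)
print(outside_1, outside_2, outside_3)
```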
Illustration—22
A group of 100 items has a mean of 60 and a variance of 25. If the mean of 50 of these items is 61 and its σ 4.5, find the mean and variance of the other 50 items.

         Combined group    Group 1    Group 2
N =          100              50         50
a =           60              61          ?
σ =            5             4.5          ?

$\text{Combined average} = \dfrac{N_1 a_1 + N_2 a_2}{N_1 + N_2}$

$60 = \dfrac{50 \times 61 + 50 \times a_2}{50 + 50}$

$3050 + 50a_2 = 6000$

$50a_2 = 6000 - 3050 = 2950$

$a_2 = \dfrac{2950}{50} = 59$

$\text{Combined } \sigma^2 = \dfrac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}$

$d_1 = (61 - 60) = +1$

$d_2 = (59 - 60) = -1$

$25 = \dfrac{(50 \times 4.5^2) + (50 \times \sigma_2^2) + (50 \times 1^2) + (50 \times 1^2)}{100}$

$2500 = 1012.5 + 50\sigma_2^2 + 100$

$50\sigma_2^2 = 2500 - 1012.5 - 100 = 1387.5$

$\sigma_2^2 = \dfrac{1387.5}{50} = 27.75$

Hence the variance of the other 50 items is 27.75, and σ₂ = 5.27 approx.
Illustration—22A
For a frequency distribution of marks in History of 200 candidates (grouped in intervals 0-5, 5-10, ...) the mean and standard deviation were found to be 40 and 15. Later it was discovered that the score 43 was misread as 53 in obtaining the frequency distribution. Find the corrected mean and standard deviation corresponding to the corrected frequency distribution.
(I.A.S.)
The score 43 belongs to the class interval 40-45, while the score 53 belongs to the interval 50-55. The mid points of these classes are 42.5 and 52.5 respectively. The misreading has therefore increased the total Σfx by 10.

Therefore the actual total should be 40 × 200 - 10 = 7990

$\text{Hence the corrected mean} = \dfrac{7990}{200} = 39.95$

N = 200, a = 40, σ = 15

$\sum f x^2 = (225 + 1600) \times 200 = 365000$

Clearly the corrected Σfx²
$= 365000 + (42.5)^2 - (52.5)^2$
$= 365000 - 950$
$= 364050$

$\text{Correct S.D.} = \sqrt{\dfrac{364050}{200} - \left(\dfrac{7990}{200}\right)^2} = 14.975 \text{ approx.}$

Illustration—23

The following table gives the marks secured by a class of 
100 students. Calculate the average marks and their standard 
deviation (i) By using the totals only (ii) By using the whole 
data. 


(Digits i.e. Divisions of class intervals) 


Marks ой ЛӨР | 814] 5] 6 a 8 | 9 | Total 
o9 | a) a т {a Гав 
10—19 513 4 2 1 15 
20—29 [d ‚| 8/10] 5 4| 3| 2 | 40 
30—89 | а воа 1 1 95 
40—49 | 4, | 8|2 2 11 


(Thus 2 marks are obtained by 4 students, 15 marks by 
2 students, 25 marks by 5 students and so on). 


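(i) By using totals only:

Marks     M.V.     f     dx (from 24.5)    dx/i (i = 10)    fdx     fdx²
 0-9       4.5    12        - 20              - 2           - 24     48
10-19     14.5    15        - 10              - 1           - 15     15
20-29     24.5    40           0                0              0      0
30-39     34.5    22        + 10              + 1           + 22     22
40-49     44.5    11        + 20              + 2           + 22     44
Total            100                                        +  5    129

$a = x + \dfrac{\sum f dx}{N} \times i = 24.5 + \dfrac{5}{100} \times 10 = 25 \text{ Marks}$

$\sigma = \sqrt{\dfrac{\sum f dx^2}{N} - \left(\dfrac{\sum f dx}{N}\right)^2} \times i = \sqrt{\dfrac{129}{100} - \left(\dfrac{5}{100}\right)^2} \times 10 = \sqrt{1.29 - .0025} \times 10 = \sqrt{1.2875} \times 10 = 11.3 \text{ Marks}$

(ii) By using the whole data (the deviations of the individual marks are taken from the assumed average 25 in a separate working table):

$a = x + \dfrac{\sum f dx}{N} = 25 + \dfrac{-77}{100} = 25 - .77 = 24.23 \text{ Marks}$

$\sigma = \sqrt{\dfrac{\sum f dx^2}{N} - \left(\dfrac{\sum f dx}{N}\right)^2} = \sqrt{\dfrac{13917}{100} - \left(\dfrac{77}{100}\right)^2} = \sqrt{139.17 - .5929} = \sqrt{138.5771} = 11.77 \text{ Marks}$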
(Illustration 23 contd.) Working table for method (ii), by using the entire data: the deviations dx of each mark from 0 to 49 are taken from the assumed average 25, giving the totals N = 100, Σfdx = -77 and Σfdx² = 13917.

Illustration—24
(Adjustment of Raw Scores)

Three candidates in an examination had the same aggregate marks and were bracketed equal. Use the following data to determine whether or not this placing was equitable.

Marks awarded out of 100 for

Candidate    Business Administration    Economics    Commerce    Total
    A                  95                   70           61        226
    B                  69                   83           74        226
    C                  70                   74           82        226

The mean marks in the three subjects were 55, 53 and 50, and the standard deviations in the same order were 16, 12 and 11. (B. Com. Bombay)

In such instances the original marks must be expressed in standard measures before they are added together for further statistical calculations. These adjusted scores, sometimes called Z-scores, may then fairly be added together.

$\text{Adjusted marks} = \dfrac{\text{Marks obtained} - \text{Mean marks}}{\text{Standard deviation}}$

(i) A's adjusted marks.
Business Admn.   (95 - 55)/16 = 40/16 = 2.50
Economics        (70 - 53)/12 = 17/12 = 1.41
Commerce         (61 - 50)/11 = 11/11 = 1.00
Total                                   4.91

(ii) B's adjusted marks.
Business Admn.   (69 - 55)/16 = 14/16 = 0.87
Economics        (83 - 53)/12 = 30/12 = 2.50
Commerce         (74 - 50)/11 = 24/11 = 2.18
Total                                   5.55

(iii) C's adjusted marks.
Business Admn.   (70 - 55)/16 = 15/16 = 0.93
Economics        (74 - 53)/12 = 21/12 = 1.75
Commerce         (82 - 50)/11 = 32/11 = 2.91
Total                                   5.59

The adjusted marks show that the equitable order would be C, B, A.
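The adjustment of raw scores can be sketched in Python as below. The subject means 55, 53 and 50 are read off the worked solution (they appear in the numerators 95 - 55, 70 - 53 and 61 - 50) and should be treated as an assumption of this sketch; the standard deviations 16, 12 and 11 are those stated in the problem. Function and variable names are illustrative.

```python
def adjusted_total(marks, means, sds):
    """Sum of (mark - subject mean) / subject standard deviation."""
    return sum((m - a) / s for m, a, s in zip(marks, means, sds))

# Order of subjects: Business Administration, Economics, Commerce.
means = [55, 53, 50]     # assumed from the worked solution
sds = [16, 12, 11]

candidates = {
    "A": [95, 70, 61],
    "B": [69, 83, 74],
    "C": [70, 74, 82],
}

totals = {name: adjusted_total(marks, means, sds) for name, marks in candidates.items()}
ranking = sorted(totals, key=totals.get, reverse=True)
print({name: round(t, 2) for name, t in totals.items()}, ranking)
```

The unrounded totals differ from the hand figures in the second decimal place, because the text truncates each quotient before adding; the order of the candidates is unaffected.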


Note: This measure is called the standard deviation above average. It measures how much above or below the average a particular observation is, in units of the standard deviation.


Advantages of Standard Deviation


1—It includes every value of the distribution.
2—It is itself the result of correct mathematical processes and thus further calculations may be based upon it.
3—It is the best measure of dispersion and it is of very great importance for sampling theory.


The Third Moment of Dispersion. This is not a satisfactory measure of dispersion. It is rarely used in practice. Under this method the deviations of the items from an average are cubed. Symbolically

$\text{T. M. of Dispersion} = \dfrac{\sum (d)^3}{N}$

Quartile Deviation. Quartile deviation or semi-interquartile range is half the difference between the third and the first quartiles. The formula for the Quartile Deviation is

$\text{Q.D.} = \dfrac{Q_3 - Q_1}{2}$ and its coefficient is $\dfrac{(Q_3 - Q_1)/2}{(Q_3 + Q_1)/2}$ or $\dfrac{Q_3 - Q_1}{Q_3 + Q_1}$


Illustration—25


Series of Individual observations


Calculate the quartile deviation and its coefficient from the following data.

Months    Monthly earnings
   1            39
   2            40
   3            40
   4            41
   5            41
   6            42
   7            42
   8            43
   9            43
  10            44
  11            44
  12            45

$Q_1 = \dfrac{N+1}{4}\text{th item} = \dfrac{12+1}{4} = \dfrac{13}{4} = 3.25\text{th item}$

= Size of the 3rd item + (Size of 4th item - Size of 3rd item) × .25

= 40 + (41 - 40) × .25 = 40.25

$Q_3 = \dfrac{3(N+1)}{4}\text{th item} = \dfrac{3(12+1)}{4} = 9.75\text{th item}$

= Size of the 9th item + (Size of 10th item - Size of 9th item) × .75

= 43 + (44 - 43) × .75 = 43.75

$\text{Q.D.} = \dfrac{Q_3 - Q_1}{2} = \dfrac{43.75 - 40.25}{2} = \dfrac{3.50}{2} = 1.75$

$\text{Coefficient of Q.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{43.75 - 40.25}{43.75 + 40.25} = \dfrac{3.50}{84.00} = .042$
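The rule used above, locating the (N+1)/4th and 3(N+1)/4th items and interpolating between neighbouring items, can be sketched in Python. The helper names are hypothetical, and the sketch assumes the requested position falls within the series, which holds for the quartiles of any series of three or more items.

```python
def item_at(sorted_items, position):
    """Value at a 1-based, possibly fractional, position (linear interpolation)."""
    whole = int(position)
    fraction = position - whole
    lower = sorted_items[whole - 1]
    if fraction == 0:
        return lower
    upper = sorted_items[whole]
    return lower + fraction * (upper - lower)

def quartile_deviation(items):
    """Return Q1, Q3, the quartile deviation and its coefficient."""
    ordered = sorted(items)
    n = len(ordered)
    q1 = item_at(ordered, (n + 1) / 4)
    q3 = item_at(ordered, 3 * (n + 1) / 4)
    return q1, q3, (q3 - q1) / 2, (q3 - q1) / (q3 + q1)

earnings = [39, 40, 40, 41, 41, 42, 42, 43, 43, 44, 44, 45]
q1, q3, qd, coefficient = quartile_deviation(earnings)
print(q1, q3, qd, round(coefficient, 3))
```

Library routines for quantiles may use a different positioning rule and so need not return exactly these figures.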


Illustration—26
(Discrete Series)


Calculate the Quartile Deviation and its coefficient from the following data:

Size    Frequency    Cumulative frequency
  2         2                 2
  4         9                11
  6        11                22
  8        14                36
 10        20                56
 12        24                80
 14        20               100
 16        16               116
 18         5               121
 20         2               123

$Q_1 = \dfrac{N+1}{4}\text{th item} = \dfrac{123+1}{4} = \dfrac{124}{4}$

= 31st item, the value of which is 8

$Q_3 = \dfrac{3(N+1)}{4}\text{th item} = \dfrac{3(123+1)}{4} = \dfrac{372}{4}$

= 93rd item, the value of which is 14

$\text{Q.D.} = \dfrac{Q_3 - Q_1}{2} = \dfrac{14 - 8}{2} = 3$

$\text{Coefficient of Q.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{14 - 8}{14 + 8} = \dfrac{6}{22} = .27$

Illustration—27
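Continuous Series


Calculate the Quartile Deviation and its coefficient from the following table:

Marks      f     c.f.
 0-10      3      3
10-20      9     12
20-30     12     24
30-40     20     44
40-50      8     52
50-60      6     58
60-70      6     64
70-80      5     69

$q_1 = \dfrac{N+1}{4}\text{th item} = \dfrac{69+1}{4} = 17.5\text{th item}$

$Q_1 = l_1 + \dfrac{l_2 - l_1}{f}(q_1 - c) = 20 + \dfrac{30 - 20}{12}(17.5 - 12) = 20 + \dfrac{10}{12} \times 5.5 = 20 + 4.6 = 24.6$

$q_3 = \dfrac{3(N+1)}{4}\text{th item} = \dfrac{3(69+1)}{4} = \dfrac{210}{4} = 52.5\text{th item}$

$Q_3 = l_1 + \dfrac{l_2 - l_1}{f}(q_3 - c) = 50 + \dfrac{60 - 50}{6}(52.5 - 52) = 50 + .8 = 50.8$

Here q₁ and q₃ denote the positions of the quartile items, c the cumulative frequency of the class preceding the quartile class and f the frequency of the quartile class.

$\text{Q.D.} = \dfrac{Q_3 - Q_1}{2} = \dfrac{50.8 - 24.6}{2} = \dfrac{26.2}{2} = 13.1$

$\text{Coefficient of Q.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{50.8 - 24.6}{50.8 + 24.6} = \dfrac{26.2}{75.4} = .35$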
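For a continuous series the quartile is found by interpolating inside the class that contains the required item. The Python sketch below applies that rule to the marks table; the function name `grouped_value_at` is hypothetical and equal class widths are assumed.

```python
def grouped_value_at(lower_limits, width, freqs, position):
    """Interpolated value of the item at `position` in a grouped series."""
    cumulative = 0                        # c.f. of the classes already passed
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= position:
            return lower + width / f * (position - cumulative)
        cumulative += f
    raise ValueError("position lies beyond the last class")

lower_limits = [0, 10, 20, 30, 40, 50, 60, 70]
freqs = [3, 9, 12, 20, 8, 6, 6, 5]
n = sum(freqs)

q1 = grouped_value_at(lower_limits, 10, freqs, (n + 1) / 4)
q3 = grouped_value_at(lower_limits, 10, freqs, 3 * (n + 1) / 4)
qd = (q3 - q1) / 2
coefficient = (q3 - q1) / (q3 + q1)
print(round(q1, 1), round(q3, 1), round(qd, 1), round(coefficient, 2))
```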


Advantages of Quartile Deviation
1—This measure of dispersion is very easy to calculate.
2—The smaller the result given by the Q.D. formula, the less is the dispersion of the middle half of the distribution about the median.
Demerits of Q.D.
1—It is based on the values of the two limits, i.e. Q₃ and Q₁. It does not, therefore, eliminate the risk of extreme items.
2—It provides no indication of the degree of dispersion of the part of the distribution falling beyond Q₃ and Q₁. Consequently some further measure is required which will indicate the dispersion of all the items.

3—The Quartile Deviation is not suited to algebraic treatment.
Relation between Dispersion measures

Though much depends upon the nature of the frequency distribution, the following relationships may hold good approximately in a moderately symmetrical distribution:

$\text{Q.D.} = \tfrac{2}{3}\sigma$

$\text{Mean Deviation} = \tfrac{4}{5}\sigma$


Graphic Method of Dispersion

Lorenz Curve. The Lorenz curve is a cumulative frequency curve designed to show the deviations in the distribution of the particular subject of enquiry over the group of the data. This method of studying dispersion with the help of curves was devised by Dr. Max O. Lorenz, a famous economic statistician. He used this technique for measuring the distribution of wealth and income.


Economists are very often interested in the distribution of something by size, e.g. income, wealth etc. It is often important to know how far such a distribution departs from one of equality, or whether one distribution is more or less unequal than another. The Lorenz curve helps in appraising and demonstrating the degree of inequality of income or wealth of a group of people. It is most commonly used to show inequality of income or wealth in a country, and sometimes to make comparisons between countries or between different time periods.


Illustration—28 


From the figures given below, draw a graph to show which 
group has greater inequality :— 


Income No. of persons No. of persons 
A—State B—State 
Below—500 6000 5000 
500— 1000 4250 4500 
1000—2000 3600 4800 
2000—3000 1500 2200 
8000—4000 650 1500 


Before drawing the graph, following caleulations will be 
made :— 


%0`00т 00081 0081 % 00Т 00091 
А %0'16 00991 0055 0:96 08881 
Е ГАУ 008% 9998 09861 
= 250'89 0086 003% 950"*9 09201 
: 950'86 0009 0009 Wg Le 0009 
2 — 
à Kouonbo13 ‘ON виозләй Aouenbaaz ‘ON 
A [2103 03 05 — 9Anw[num?) јо “ON 12303 01 % элеш) 
Я 21935 У 91815 


029 95001 0038 0098 0007—0008 
0081 9568 000€ 0095 0008—0005 
009$ %65 0035 0091 0005—0001 
0957 956 000т 0924 0001—00% 
0009 958 026 035 009 —0 
ѕиовләа [#101 `5}[8әп[зА ‘sy `5 

јо ‘ом | о} 0 әлцер ‘A'W әшоопү 

-nuing 
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These calculations are done as given below— 


(1) The mid-values of the sizes are found out and they are 
made cumulative, then taking the last cumulative item as equal 
to 100, percentages to different cumulative items are found out. 


(2) Similarly frequency distribution is made cumulative and 
taking the last cumulative item as equal to 100, percentages to 
other cumulative items are found out. 


The graph is prepared in the following manner. 


(1) The cumulative percentages of size are taken on x-axis 
and cumulative percentages of frequency distribution are taken 
on y-axis. 

(2) On the x-axis the percentages are taken in the reverse
order, starting from 100 and ending at 0. On the y-axis we start
from 0 and end at 100.

(3) The point 0 on the x-axis and the point 100 on the y-axis are
joined by a straight line. This line is called the line of equal
distribution.

(4) Then the points are plotted and the curve is drawn.

(5) The greater the distance between the curve and the line
of equal distribution, the greater will be the dispersion. The
nearer the curve is to this line, the less will be the dispersion.
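The two calculation steps described above can be sketched in Python. This is only an illustrative sketch: the function names `cumulative_percentages` and `total_gap` are hypothetical, and the "total gap" yardstick is an assumption added for the comparison, not a method given in the text.

```python
def cumulative_percentages(values):
    """Running totals of `values`, each expressed as a percentage of the grand total."""
    total = sum(values)
    running = 0
    percentages = []
    for value in values:
        running += value
        percentages.append(100.0 * running / total)
    return percentages


def total_gap(population_pct, income_pct):
    """Sum of the gaps between the two cumulative percentages.

    Illustrative yardstick only (an assumption, not from the text):
    the larger the sum, the further the curve lies from the line of
    equal distribution.
    """
    return sum(p - i for p, i in zip(population_pct, income_pct))


mid_values = [250, 750, 1500, 2500, 3500]   # mid-values of the income classes
state_a = [6000, 4250, 3600, 1500, 650]     # persons in State A
state_b = [5000, 4500, 4800, 2200, 1500]    # persons in State B

income_pct = cumulative_percentages(mid_values)
a_pct = cumulative_percentages(state_a)
b_pct = cumulative_percentages(state_b)

gap_a = total_gap(a_pct, income_pct)
gap_b = total_gap(b_pct, income_pct)

for inc, a, b in zip(income_pct, a_pct, b_pct):
    print(f"income {inc:5.1f}%   State A {a:5.1f}%   State B {b:5.1f}%")
print("State A lies further from equality:", gap_a > gap_b)
```

Plotting `a_pct` and `b_pct` against `income_pct` gives the two curves; the comparison of gaps is just a numerical stand-in for reading the distance off the graph.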


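[Figure: Lorenz curve. Vertical axis: Income (cumulative percentage, 0 to 100). Horizontal axis: Population (cumulative percentage, 100 to 0). Curves shown for State A and State B.]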
The above graph shows that there is greater inequality
of income in State A than in State B.


SKEWNESS 


No single statistical measure is able to give a complete
description of a series. An average represents the central
tendency of the distribution, and dispersion measures the
degree of variation around the central value. Skewness is
another statistical measure; it shows the degree to which a
distribution departs from symmetry. Most frequency distributions
show a concentration of frequencies about the centre of the
distribution. As the variable increases, class frequencies
increase to a maximum and then decrease. If the distribution is
bell-shaped it is symmetrical; if not, it is skewed. The skewness
may be positive or negative. The following graphs illustrate this.


[Figure: three frequency curves: symmetrical, positively skewed and negatively skewed.]
In a symmetrical distribution the Mean, Median and Mode
coincide. As the distribution departs from symmetry, these
three values are pulled apart. In positive skewness the mean will
be greater than the mode, and in negative skewness the mean will
be less than the mode. Thus the skewness of a frequency distribution
is the extent of divergence of the arithmetic mean or median
from the mode. The measures of dispersion fail to show the
manner in which the deviations are distributed. Skewness shows
the position of the items in relation to the mode or the median and
brings out the departure of the distribution from the symmetrical
form of the curve. The measures of skewness not only measure the
relative or absolute skewness but also show the side on which the
skewness lies.


In order to see whether a distribution is symmetrical or
skewed, the following facts should be noticed.

1. Whether the Mean, the Median and the Mode are
identical.

2. Whether the sum of the positive deviations from the
Median is equal to the sum of the negative deviations.

3. Whether the pairs of such measures as quartiles, deciles
etc. are equidistant from the median.

4. Whether at the points of equal distance from the mode
on the two sides, the frequencies are equal.

5. Whether the series plotted on a graph paper will give
a normal bell-shaped curve.

If all these conditions are satisfied there will be no skewness in the
distribution.
Measures of Skewness. There are mainly three methods of
calculating skewness. They are:

1. The Position of Averages method
2. The Quartile measure of skewness
3. The Cubed Deviation method

These measures of skewness may be absolute. For comparison,
the absolute measure is changed to a relative measure,
which is called the ‘coefficient of skewness’ and is represented by ‘j’.


1. The Position of Averages Method. The following
formulae are commonly used to measure the skewness. Here $a$ is
the arithmetic mean, $M$ the median, $Z$ the mode, $\delta$ the
mean deviation and $\sigma$ the standard deviation.

(i) Skewness $= a - Z$, and coefficient of skewness

$$j = \frac{a - Z}{\delta} \quad \text{or} \quad j = \frac{a - Z}{\sigma}$$

(ii) Skewness $= a - M$, and coefficient of skewness

$$j = \frac{a - M}{\delta} \quad \text{or} \quad j = \frac{a - M}{\sigma}$$

(iii) Professor Karl Pearson has given the following
formula for deriving the coefficient of skewness. This formula is
very much used in practice. It is

$$j = \frac{a - Z}{\sigma}$$

If in a particular frequency distribution it is difficult to
determine the mode precisely, a variation of this formula is used.
In that case

$$j = \frac{3(a - M)}{\sigma}$$

This is based on the equation

$$Z = a - 3(a - M)$$

or $a - Z = 3(a - M)$

Illustration 29 (Discrete series)

Find out the coefficient of skewness for the following:

Wages in Rs.   No. of Men   Wages in Rs.   No. of Men
4.5            35           8.5            125
5.5            40           9.5            87
6.5            48           10.5           43
7.5            100          11.5           22

Wages (m)   f      dx (from 7.5)   fdx     fdx²
4.5         35     -3              -105    315
5.5         40     -2              -80     160
6.5         48     -1              -48     48
7.5         100    0               0       0
8.5         125    +1              +125    125
9.5         87     +2              +174    348
10.5        43     +3              +129    387
11.5        22     +4              +88     352
Total       500                    +283    1735

$$a = x + \frac{\sum f\,dx}{N} = 7.5 + \frac{283}{500} = 7.5 + 0.566 = 8.066$$

Mode $= 8.5$

$$\sigma = \sqrt{\frac{\sum f\,dx^2}{N} - \left(\frac{\sum f\,dx}{N}\right)^2} = \sqrt{\frac{1735}{500} - \left(\frac{283}{500}\right)^2} = \sqrt{3.15} = 1.77$$

$$j = \frac{a - Z}{\sigma} = \frac{8.066 - 8.5}{1.77} = \frac{-0.434}{1.77} = -0.245$$
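The arithmetic of this illustration can be checked with a short Python sketch. The function name `pearson_skewness` is hypothetical, and the mode is taken simply as the value with the highest frequency, which suits a discrete series like this one but is an assumption for anything else.

```python
from math import sqrt


def pearson_skewness(values, freqs):
    """Return (mean, mode, sigma, j) for a discrete frequency series.

    j is Karl Pearson's coefficient (mean - mode) / sigma; the mode is
    taken as the value carrying the largest frequency.
    """
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    sigma = sqrt(variance)
    mode = values[freqs.index(max(freqs))]
    return mean, mode, sigma, (mean - mode) / sigma


wages = [4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5, 11.5]
men = [35, 40, 48, 100, 125, 87, 43, 22]

mean, mode, sigma, j = pearson_skewness(wages, men)
print(f"mean={mean:.3f} mode={mode} sigma={sigma:.2f} j={j:.3f}")
```

Working directly from the mean rather than from an assumed mean gives the same standard deviation; the short-cut method in the text only saves hand arithmetic.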


Illustration—30 (Continuous series)

Find out the coefficient of skewness for the following
distribution:

Variable   Frequency   Variable   Frequency
0–5        2           20–25      21
5–10       5           25–30      16
10–15      7           30–35      8
15–20      13          35–40      3

Variable   M.V.    f     dx (from 17.5)   fdx     fdx²
0–5        2.5     2     -15              -30     450
5–10       7.5     5     -10              -50     500
10–15      12.5    7     -5               -35     175
15–20      17.5    13    0                0       0
20–25      22.5    21    +5               +105    525
25–30      27.5    16    +10              +160    1600
30–35      32.5    8     +15              +120    1800
35–40      37.5    3     +20              +60     1200
Total              75                     +330    6250

$$a = x + \frac{\sum f\,dx}{N} = 17.5 + \frac{330}{75} = 17.5 + 4.4 = 21.9$$

The mode is located in the class 20–25.

$$Z = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2}(l_2 - l_1) = 20 + \frac{21 - 13}{42 - 13 - 16} \times 5 = 20 + \frac{40}{13} = 23.07$$

$$\sigma = \sqrt{\frac{\sum f\,dx^2}{N} - \left(\frac{\sum f\,dx}{N}\right)^2} = \sqrt{\frac{6250}{75} - \left(\frac{330}{75}\right)^2} = \sqrt{63.97} = 8 \text{ (approx.)}$$

$$j = \frac{a - Z}{\sigma} = \frac{21.9 - 23.07}{8} = \frac{-1.17}{8} = -0.146$$
Quartile Measure of Skewness

The quartile measure of skewness is $Q_3 + Q_1 - 2M$

and its coefficient is

$$j = \frac{Q_3 + Q_1 - 2M}{Q_3 - Q_1}$$

The above measure of skewness has been proposed by the
late Professor Bowley. It is based on the relative positions
of the median and the quartiles. If the distribution is symmetrical,
then $Q_3$ and $Q_1$ would be at equal distances from the median.
Thus $(Q_3 - M) - (M - Q_1) = 0$ or $Q_3 + Q_1 - 2M = 0$. The measures
of skewness suggested by Bowley and Pearson are not comparable.


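Illustration—31

Find the Quartile measure of skewness of the following:

Variable   No. of students   C.F.
0–10       15                15
10–20      20                35
20–30      25                60
30–40      24                84
40–50      12                96
50–60      31                127
60–70      71                198
70–80      52                250

$$M = \frac{N + 1}{2}\text{th item} = \frac{250 + 1}{2} = 125.50\text{th item}$$

$$Q_1 = \frac{N + 1}{4}\text{th item} = \frac{250 + 1}{4} = 62.75\text{th item}$$

$$Q_3 = \frac{3(N + 1)}{4}\text{th item} = \frac{3(250 + 1)}{4} = 188.25\text{th item}$$

In the formulae below $l_1$ and $l_2$ are the limits of the class
containing the item, $f_1$ is its frequency, $c$ is the cumulative
frequency of the preceding class, and $q_1$, $m$, $q_3$ are the item
numbers found above.

$$Q_1 = l_1 + \frac{l_2 - l_1}{f_1}(q_1 - c) = 30 + \frac{40 - 30}{24}(62.75 - 60) = 30 + \frac{10 \times 2.75}{24} = 30 + 1.15 = 31.15$$

$$Q_3 = l_1 + \frac{l_2 - l_1}{f_1}(q_3 - c) = 60 + \frac{70 - 60}{71}(188.25 - 127) = 60 + \frac{10 \times 61.25}{71} = 60 + 8.6 = 68.6$$

$$M = l_1 + \frac{l_2 - l_1}{f_1}(m - c) = 50 + \frac{60 - 50}{31}(125.5 - 96) = 50 + \frac{10 \times 29.5}{31} = 50 + 9.5 = 59.5$$

Skewness $= Q_3 + Q_1 - 2M = 68.6 + 31.15 - 2 \times 59.5 = -19.25$

Coefficient of skewness

$$j = \frac{Q_3 + Q_1 - 2M}{Q_3 - Q_1} = \frac{68.6 + 31.15 - 119}{68.6 - 31.15} = \frac{-19.25}{37.45} = -0.5 \text{ (approx.)}$$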
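A minimal Python sketch of the same working is given here, assuming equal class widths and using the (N + 1)/4 rule for locating the quartiles as in the illustration. The helper name `value_at_rank` is hypothetical.

```python
def value_at_rank(lower_limits, width, freqs, rank):
    """Interpolate the value of the `rank`-th item in a grouped series."""
    cumulative = 0  # cumulative frequency of the classes already passed
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= rank:
            return lower + width * (rank - cumulative) / f
        cumulative += f
    raise ValueError("rank exceeds the total frequency")


lower_limits = [0, 10, 20, 30, 40, 50, 60, 70]
students = [15, 20, 25, 24, 12, 31, 71, 52]
n = sum(students)

q1 = value_at_rank(lower_limits, 10, students, (n + 1) / 4)
median = value_at_rank(lower_limits, 10, students, (n + 1) / 2)
q3 = value_at_rank(lower_limits, 10, students, 3 * (n + 1) / 4)

skewness = q3 + q1 - 2 * median          # absolute quartile measure
j = skewness / (q3 - q1)                 # Bowley's coefficient
print(f"Q1={q1:.2f} M={median:.2f} Q3={q3:.2f} j={j:.2f}")
```

Because no intermediate rounding is done, the figures may differ from the hand calculation in the last decimal place.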


Note: The cubed deviation method of skewness will be
dealt with in the next chapter.

The coefficient of skewness is useful because it gives a
precise number which is easy to interpret. If the series has a
symmetrical distribution, the measures and their coefficients are
equal to zero, which shows lack of skewness. Karl Pearson's
coefficient of skewness will generally lie between ±3. The
quartile measure of skewness will generally vary between ±1.
If it is minus there is negative skewness, and if it is
plus there is positive skewness. If j is less than .1 it
should be regarded as not very significant, and if it is .3 it is
regarded as significant.


Theoretical Questions 


1—What is meant by dispersion? What are the methods 
of computing measures of dispersion? Illustrate the practical 
utility of such measures. (М. Com. Alld.) 


2—What is meant by skewness ? How does it differ from 
dispersion ? What is the practical utility of these measures ? 


3—Discuss the various methods by which the differences in 
the characteristics of frequency distribution are generally 
measured. (B. Com. B.H.U.) 


4—What do you understand by dispersion ? Explain the
various methods of its measurement and point out their advantages
and disadvantages. (B. Com. Luck.)


5—Discuss the relative merits of range, standard deviation
and mean deviation as measures of dispersion. (M. A. Alld.)

6—Show how measures of dispersion help in explaining that
though frequency distributions may have the same values of their
averages they may differ in their respective formation. In what
respect are measures of dispersion of use in statistics ?

7—Average, measures of dispersion and skewness are
complementary to one another in understanding a frequency
distribution. Elucidate this statement giving illustrations.

8— Frequency distributions may either differ in the numerical 
size of their averages though not necessarily in their formations 
or they may have the same values of the averages yet differ in 
their respective formations. 

Explain and illustrate how the measures of dispersion afford 
а supplement to the information about the frequency distributions 
given by the averages. (M. Com. Raj.) 


Practical Questions 
1—Yield of sugar cane in tons per acre on 20 farms in the 
U.P. was as follows :— 
18, 15, 28, 20, 17, 23, 16, 16, 20, 19, 19, 25,
16, 13, 21, 23, 21, 27, 18, and 22
Calculate the Standard Deviation. (B. Com. Agra & B.H.U.)
(Ans. σ = 3.889)


2— Calculate the Mean Deviation and the standard deviation 
from the following data :— 


Exceeding Not exceeding Frequency 
7.5 8.5 2 
8.5 9.5 4 
9.5 10.5 5 
10.5 11.5 7 
11.5 12.5 9 
12.5 13.5 8 
13.5 14.5 1 


(Ans. δM = 1.33, σ = 1.49) (B. Com. B.H.U.)


3—Calculate the Standard Deviation from the following
data:


Size of item   6   7   8   9    10   11   12
Frequency      3   6   9   13   8    5    4
4—Calculate (a) Median coefficient of dispersion and (b)
Mean coefficient of dispersion from the following data :— 


Size of item 4 6 8 10 12 14 16 
Frequency 2404 52092. г" 


(Ans. Coef. of 5n=0.40475, Coeff. of 8,—0.34) (М.А. Agra) 


5—Find the median, mean deviation and standard deviation for 
the following distribution of tomato plants :— 


No. of ‘Tomatoes No. of 
per plant plants 
0 2 
1 5 
2 7 
3 11 
4 18 
5 24 
6 12 
7 8 
8 6 
9 4 
10 3 


(B. Com. Bombay) 
(Ans. M = 5, δM = 1.68, σ = 2.09.)


6—Find the Mean and Standard Deviations of the distribution 
given below :— 


No. of accidents Persons having said number 
of accidents 

0 15 

1 16 

2 21 

З 10 

4 17 

5 8 

6 4 

7 2 

8 1 

9 2 

10 E 
11 ` 

12 2 

Total 100 


(Ans. a = 2.85, σ = 2.65) (B. Com. Bombay)
7—Compute the mean and the Standard Deviation from the
following data drawn randomly:


Monthly Expenditure       Number of Students
on Food and Luxuries
78–82                     3
73–77                     6
68–72                     7
63–67                     12
58–62                     17
53–57                     13
48–52                     9
43–47                     7
38–42                     4
33–37                     2
28–32                     1


(M. Com. B. H. U.)
(Ans. a = 58.27, σ = 11.82)


8—A manufacturer of collars supplies to you the following
data regarding the neck-circumferences of the students of
B. H. U.:


Neck-circumference    No. of Students
in inches
12.0                  5
12.5                  20
13.0                  30
13.5                  43
14.0                  60
14.5                  56
15.0                  37
15.5                  16
16.0                  3


Calculate the Standard Deviation and advise the manufacturer
as to the largest and smallest size of collars he should make in
order to meet the needs of most of his customers (by using the
criterion Mean ± 3 Standard Deviation).

(B. Com. B. H. U.)


(Ans. a = 14.01 inches, σ = 0.87 inches. The largest size
of collar being 14.01 + 3 × 0.87 = 16.62 inches and the smallest size
being 14.01 - 3 × 0.87 = 11.40 inches).


9—A distribution consists of three components with 
frequencies 200, 250 and 300 having means of 25, 10 and 15, 
and standard deviation of 8, 4 and 5 respectively. Find the mean 
and the standard deviation of the combined distribution. 

(M. Com. B. H. U.) 


(Ans. Combined a = 16, σ = 7.2)
10—The following are the scores of 9 candidates in a
contest. Calculate the mean deviation of the scores.


Roll No.   Score   Roll No.   Score
1          68      6          38
2          49      7          59
3          32      8          66
4          21      9          41
5          54
(Ans. δ = 12.8)


11—From the following distribution of grades of 327 cadets
calculate the Mean Deviation and state whether the distribution
is normal.


Grade         No. of cadets
68.0–69.9     4
70.0–71.9     17
72.0–73.9     39
74.0–75.9     62
76.0–77.9     58
78.0–79.9     52
80.0–81.9     35
82.0–83.9     22
84.0–85.9     18
86.0–87.9     13
88.0–89.9     4
90.0–91.9     2
92.0–93.9     1

Total         327


(Ans. δa = 3.64. The distribution is slightly skewed. Apply
test of properties of Normal Curve).


12—Calculate the Standard Deviation of the following data
pertaining to the wage distribution of 542 workers in an establishment.


Monthly wage (Rs.)    No. of workers

20—                   3
30—                   61
40—                   132
50—                   153
60—                   140
70—                   51
80—                   2

Total                 542


(Ans. σ = Rs. 11.9)
13—Following are the means and standard deviations of an
Achievement Test for two classes differing in size. Find the standard
deviation of the combined group.


           N     a     σ
Class A    25    80    15
Class B    75    70    25


(Ans. Combined a = 72.5, σ = 23.82)


14—Find the total frequencies and the mean and Standard 
deviation of the whole group from the following data. 


     N     a      σ
A    55    6.4    1.28
B    45    6.6    1.62


(Ans. N = 100, Combined a = 6.49, σ = 1.42)


15—What are the Mean and σ obtained by combining the
following three distributions ?


Distribution    N      a     σ
I               20     60    8
II              120    50    20
III             60     40    12


(Ans. Combined a = 48, σ = 18.05)


16—Find the coefficient of skewness of the two groups given 
below, and point out which distribution is more skew ? 


Marks Group А Group B 
55—58 12 20 
58—61 17 22 
61—64 23 25 
64—67 18 13 
67—70 11 7 
(M. А. Agra) 


(Ans. Quartile j: Group A = -0.016, Group B = -0.05.
Hence Group B is more skew)


17—Compute the quartile coefficient of dispersion and
skewness of the following array:


Central size   1,  2,  3,   4,   5,   6,   7,   8,   9,  10
Frequency      2,  9,  11,  14,  20,  24,  20,  16,  5,  2.
(B. Com. Agra)


(Ans. Q1 = 4.143, M = 5.75, Q3 = 7.15.
Coefficient of Q.D. = 0.27, and j = -0.07. It is a continuous
series)


18— The following table gives goals scored by two teams А 
and B in a football season :— 
No. of goals          No. of Matches
scored in a match     A      B
0                     27     17
1                     9      9
2                     8      6
3                     5      5
4                     4      3


Find the team which is more consistent in its performance.
(B. Com. Saugar)


(Ans. Coefficient of variation: Team A = 123.8%,
Team B = 109.0%.
Hence Team B is more consistent)


19—From the prices of shares x and y given below state 
which share is more stable in value :— 


x   55, 54, 52, 53, 56, 58, 52, 50, 51, 49
y   108, 107, 105, 105, 106, 107, 104, 103, 104, 101
(B. Com. B. H. U.)


(Ans. C.V. of x shares = 4.992%, C.V. of y shares = 1.905%.
Hence y shares are more stable.)


20—Using the following statistical data, find out the
percentage of cases that lie outside the limits indicated by
a ± σ, a ± 2σ and a ± 3σ.

148, 145, 141, 116, 96, 91, 87, 89, 91, 91, 102, 95, 108, 120
and 139.

(Ans. a = 110.6, σ = 21.82. Outside the limits a ± σ lie
33.3% of the items; no item lies outside a ± 2σ and a ± 3σ.)


21—A collar manufacturer is considering the production of
a new style of collar to attract young men. The following
measurements relate to a typical group of college students:


Neck circumference No of students 
Mid value (inches) 

12.5 4 

13.0 19 

13.5 30 

14.0 63 

14.5 66 

15.0 29 

15.5 18 

16.0 3 1 

16.5 1


Compute the mode and the standard deviation.
(B. Com. B. H. U.)
(Ans. Z = 14.2875 and σ = 0.721 inches)


22—Calculate Karl Pearson's Coefficient of Skewness from the
following data:


Marks       Number of students
above 0     150
  ,,  10    140
  ,,  20    100
  ,,  30    80
  ,,  40    80
  ,,  50    70
  ,,  60    30
  ,,  70    14
  ,,  80    0

(M.A. Raj.)


(Ans. σ = 22.8, a = 39.3, and M = 45.5, j = -0.82,
using the formula $j = \frac{3(a - M)}{\sigma}$ as the mode is ill-defined)
23—Calculate the Mean and Standard Deviation of the
following data:


Age under    No. of Persons
10 years     15
20   ,,      30
30   ,,      53
40   ,,      75
50   ,,      100
60   ,,      110
70   ,,      115
80   ,,      125
(M.A. Raj.)


(Ans. a = 35.16 years and σ = 19.7 years)
24—Find the standard deviation and the coefficient of
variation from the following data:


Wages          No. of Persons
Upto Rs. 10    12
 ,,  ,,  20    30
 ,,  ,,  30    65
 ,,  ,,  40    107
 ,,  ,,  50    157
 ,,  ,,  60    202
 ,,  ,,  70    222
 ,,  ,,  80    230


(B. Com. B.H.U.)
(Ans. σ = Rs. 17.26, C.V. = 42.69%)
25—What relationship generally subsists between different
measures of dispersion in a moderately symmetrical series ? 

If in a series which is not highly skewed the mean deviation 
is 9.8 feet, what would be the approximate value of its standard 
deviation. Also find the value of quartile deviation. 

(M.A. Punjab) 

(Ans. σ = 12.25 ft., Q.D. = 8.17 ft.)

26—Write short notes on 


(1) Dispersion (2) Standard Deviation. Calculate the
standard deviation from the following data:


Size of item    Frequency

6               3
7               6
8               9
9               13
10              8
11              5
12              4

Total           48

(Ans. σ = 1.6) (B. Com. Nag.)


27—Calculate the mean deviation from the following data.
What light does it throw on the social conditions of the community ?
Difference in age between husband and wife in a particular
community.

Difference in years   Frequency   Difference in years   Frequency
0–5                   449         20–25                 109
5–10                  705         25–30                 52
10–15                 507         30–35                 16
15–20                 281         35–40                 4


(Ans. δa = 5.8)


28—Calculate the standard deviation of the following two 
series. Which shows greater deviation ? 


Series А Series B 
192 .83 
288 87 
236 93 
229 109 
184 194 
260 126 
348 126 
991 101 
330 102 
(Ans. σ of A = 51.6, C.V. = 19.8%;
σ of B = 14.96, C.V. = 14.1%)


29—The following table gives the number of finished articles
turned out per day by different numbers of workers in a factory.
Find the mean value and the standard deviation of the daily
output of finished articles and explain the significance of ‘standard
deviation.’
Number of Articles No. of workers 

18 3 
19 ч: 
20 11 
21 14 
22 18 
23 17
24 18
25 8
26 5
27 4
(B. Com. Cal.)
(Ans. a = 22.38, σ = 2.2.)


30—Find the Quartile and Mean deviations and the coefficient
of skewness of the following data:


Height in inches    No. of students
58                  15
59                  20
60                  32
61                  35
62                  88
63                  22
64                  20
65                  10
66                  8


(B. Com. Agra)
(Ans. Q.D. = 1.5, coeff. of Q.D. = 0.01, δa = 1.6, quartile j = .3)


31—Given 
Class А Class B 
Number of students 84 60 
Mean Marks obtained by 
students 120 127 
Standard Deviation of Marks 14 12 


Find out if the mean marks of class A are significantly
higher than those of class B.
(M. Com. Alld.)


(Ans. C.V. of A = 11.7% and of B = 9.4%; hence mean marks of
A are not significantly higher than B)
32—The following are the rents of 18 houses in a certain
locality :— 


Rs. Rs. 
6=50 às ao 6—25 
5—00 И" 1% 3—00 
5—935 Lk "M 9—00 4 
5—50 в ir 4—50 
5—95 d Exi 4—00 
4—75 oo b. 5—00 
4—00 T2 ч 83—75 
5—00 “Ж са 5—00 
4—=50 c nA 8—00 


Calculate the mean deviation of this group. 
(B. Com. Luck.) 
(Ans. 8—Rs. 0.93) 


33—Find the coefficient of skewness of the two groups given
below and point out which distribution is more skewed ?


Marks      Group A    Group B
55–58      12         20
58–61      17         22
61–64      23         25
64–67      18         13
67–70      11         7
(M. A. Agra)


(Ans. j of Group A = -.02, of Group B = -.22; hence
Group B is more skewed.)


34—From the following frequency distribution of the size
of sales of tickets, calculate the mean deviation and its coefficient.


Sale in Rs.     No. of Sales
0–1.99          2
2–3.99          10
4–5.99          26
6–7.99          32
8–9.99          8
10–11.99        2
Total           80


(Ans. δa = 1.65, coefficient .27)


35—Calculate the Mean Deviation and Coefficient of deviation
of the following.

Consumption in K.W. Hrs.     No. of users
0 but less than 10           10
10 but less than 20          25
20 but less than 30          30
30 but less than 40          20
40 but less than 50          15
Total                        100


(Ans. δa = 9.8, Coeff. .37)


36—Calculate Standard Deviation and semi-interquartile
range from the following table giving the age distribution of 542
members of the Parliament.


Age    No. of M.Ps
20     3
30     61
40     132
50     153
60     140
70     51
80     2

(B. Com. Nag.)


(Ans. σ = 11.6, Q.D. = 5)
37— Find out the mean and standard deviation of the 
following data— 


| Age under No. of persons Age under No. of persons 
dying dying 
10 15 60 110 
20 30 70 115 
1 


80 58 30 125 
(М. Сом. Vikram) 


(Ans. a = 35.08, σ = 19.4)


38—Records are kept of the length of illness involved in
two types of disease. The results, grouped in 14-day
intervals, are as follows:


Length of illness (days)    No. of Patients
                            A      B
0–13                        5      3
14–27                       17     21
28–41                       14     7
42–55                       7      3
56–69                       5      1

Calculate the mean and the average deviation around the
mean for each distribution. What conclusions may be drawn as to
the relative severity of the two diseases ?


(Gujrat B. Com.)
(Ans. Mean of A = 31.46, M.D. = 13.65;
Mean of B = 25.7, M.D. = 9.58.
Hence the disease A is more severe than disease B.)


39—Calculate the coefficient of mean deviation and coefficient
of skewness from the following frequency distribution.


Size      Frequency

4–8       5
8–12      8
12–16     18
16–20     30
20–24     14
24–28     10
28–32     8
32–36     5
36–40     2


(Gujrat—B. Com.)
(Ans. Coefficient of M.D. = .29, Coefficient of Skewness
(Quartile) = .38)


40—The following table shows the number of workers in a 
factory and their weekly earnings. Calculate the coefficient of 
Standard Deviation and Coefficient of Skewness. 


Range of weekly earnings    No. of workers

4–6                         70
6–8                         350
8–10                        320
10–12                       100
12–14                       35
14–16                       20
16–18                       —
18–20                       5

Total                       900


(Gujrat—B. Com.)


(Ans. Coefficient of S.D. = .262, Coefficient of Quartile
measure of Skewness = .094)
41—From the following data, calculate mode, median and
average deviation based on Mean.


Size     Frequency

10       2
11       7
12       11
13       15
14       10
15       4
16       1

Total    50


(Ans. Z = 13, Median = 13, M.D. = 1.08) (Gujrat, B. Com.)


42—The size, arithmetic mean and standard deviation of the
income of two groups of workers are:


                          Group A     Group B

Size                      30          25
Arithmetic Mean Rs.       85.3        98.5
Standard Deviation Rs.    2.8         3.2


Find the coefficient of variation of the two groups separately
and in combination. Comment on the result.
(Bombay, B. Com.)


(Ans. C.V. of Group A = 3.28
C.V. of Group B = 3.25
Combined Mean = 91.3
Combined S.D. = 2.9
C.V. of combined groups = 3.17)
43—Find out the coefficient of variation of earnings from the
following data.


36 men get at the rate of Rs. 5.0 per man per day
40   ,,     ,,     ,,     ,,     5.5   ,,     ,,
90   ,,     ,,     ,,     ,,     6.0   ,,     ,,
138  ,,     ,,     ,,     ,,     6.5   ,,     ,,
80   ,,     ,,     ,,     ,,     7.0   ,,     ,,
61   ,,     ,,     ,,     ,,     7.5   ,,     ,,
25   ,,     ,,     ,,     ,,     8.0   ,,     ,,

(Ans. C.V. = 1.8) (Bombay, B. Com.)
44—From the following figures find the Standard Deviation
and the coefficient of variation:


Marks      No. of persons
0–10       5
10–20      10
20–30      20
30–40      40
40–50      30
50–60      20
60–70      10
70–80      4


(B. Com. Agra)
(Ans. S.D. = 15.6, C.V. = 39.2)


45—The means of two samples of sizes 50 and 100 respectively 
are 54.1 and 50.3 and the standard deviations are 8 and 7. Find 
the Mean and Standard Deviation of the sample of size 150 
obtained by combining the two samples. (В.А. Punjab & Luck.) 


(Ans. Combined Mean=51.57, S.D.—7.55) 


46—A sample of 35 values has mean 80 and S.D. 4. А 
second sample of 65 values from the same population has mean 
70 and S.D. 3. Find the S.D. of the combined sample of 100 
values. (B.A. Hons. Delhi) 


(Ans. Combined Mean = 73.5, σ = 5.85)


47—Find out the coefficient of dispersion and the coefficient of
skewness from the following table giving wages of 230 persons.


Wages in Rs.    No. of persons
70–80           12
80–90           18
90–100          35
100–110         42
110–120         50
120–130         45
130–140         20
140–150         8


(Agra B. Com.)
(Ans. S.D. = 17.8, j = -.3)


CHAPTER 10 
MOMENTS AND KURTOSIS 


Moments. Moment is a familiar mechanical term. In
mechanics ‘moment’ is a measure of a force with respect to its
tendency to produce rotation. The strength of this tendency
depends upon:


(1) The amount of force
(2) The place at which the force is applied


Suppose a yard stick AB (marked in inches) is free
to rotate about some fixed point ‘X’ called the fulcrum. The
tendency to rotate depends upon the amount of force and the distance
from the fulcrum to the point at which the force is applied.


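[Figure: yard stick AB resting on the fulcrum X, with points C and D lying between the ends and the fulcrum. A weight of 1 kilogram is placed at C, 9 inches to the left of X; a weight of ½ kilogram is placed at B, 18 inches to the right of X.]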


The tendency of AB to rotate around ‘X’ will be greater if the
force is applied between D and B or between C and A rather than
between X and C or X and D. If we place a weight of one kilogram
at C, which is 9 inches to the left of the fulcrum, or a weight of
½ kilogram at B, which is 18 inches to the right of the fulcrum,
both will exert an equal force. In the language of mechanics the
strength of the force can be measured by multiplying the amount of
force by the length of the force arm, which is its distance from
the fulcrum. That is

$$1 \text{ kilo} \times 9'' = \tfrac{1}{2} \text{ kilo} \times 18'', \qquad 9 = 9$$

The term ‘moment’ is used in statistics in a quite analogous
sense, the class frequencies being looked upon as the forces. The size
of each class frequency and the distance of each class mid-point
from the origin are the factors of great importance. The
moments of a distribution about any origin may be computed by
multiplying the frequency of each class by a given power of its
distance, along the x-axis, from the origin, totalling the resulting
products and dividing the total by the number of frequencies.
If the first moment is desired, the first power of the x-distance
is employed; if the fourth moment, the fourth power of the
x-distance, and so on.


This will be clearer from the following example.


Class     Mid-point (x)    Frequency (f)    fx
0–10      5                4                20
10–20     15               8                120
20–30     25               20               500
30–40     35               24               840
40–50     45               40               1800
50–60     55               4                220
60–70     65               0                0
Total                      100              3500
Here force $f_1$ (4) is at a distance of 5
  ,,      $f_2$ (8)     ,,     ,,     15
  ,,      $f_3$ (20)    ,,     ,,     25
  ,,      $f_4$ (24)    ,,     ,,     35
  ,,      $f_5$ (40)    ,,     ,,     45
  ,,      $f_6$ (4)     ,,     ,,     55
  ,,      $f_7$ (0)     ,,     ,,     65

Total moment about the origin = 20 + 120 + 500 + 840 + 1800
+ 220 = 3500
Total force $\sum f = 100$

Moment coefficient $= \frac{\sum fx}{\sum f} = \frac{3500}{100} = 35$. This is the force
exerted by the entire distribution at the zero point. This is called
the “First Moment around the zero”.


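If the fulcrum (origin) is the arithmetic average, then the yard
stick will be balanced, because the sum of the positive deviations
will be equal to the sum of the negative deviations.


Take the same case ($a = 35$).


Mid-point                 5       15      25      35     45      55     65
Deviation from 35 (x)     -30     -20     -10     0      +10     +20    +30
Frequency (f)             4       8       20      24     40      4      0
fx                        -120    -160    -200    0      +400    +80    0

The negative products total -480 and the positive products +480,
so the first moment about the mean is zero.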
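In statistics the moment is represented by the Greek letter $\mu$ (mu).
Thus

1st Moment about Mean  $\mu_1 = \frac{\sum fx}{N}$
2nd   ,,     ,,    ,,   $\mu_2 = \frac{\sum fx^2}{N}$
3rd   ,,     ,,    ,,   $\mu_3 = \frac{\sum fx^3}{N}$
4th   ,,     ,,    ,,   $\mu_4 = \frac{\sum fx^4}{N}$
nth   ,,     ,,    ,,   $\mu_n = \frac{\sum fx^n}{N}$

Here $x = (X - \bar{X})$.

The moments about zero are

1st Moment about zero  $\nu_1$ (nu) $= \frac{\sum (fX)}{N} = a$
2nd   ,,     ,,    ,,   $\nu_2 = \frac{\sum (fX^2)}{N}$
3rd   ,,     ,,    ,,   $\nu_3 = \frac{\sum (fX^3)}{N}$
4th   ,,     ,,    ,,   $\nu_4 = \frac{\sum (fX^4)}{N}$
nth   ,,     ,,    ,,   $\nu_n = \frac{\sum (fX^n)}{N}$

Here $\bar{X}$ means the arithmetic average.

(A) In all frequency distributions it is found that
$\mu_1 = 0$

(B) In symmetrical distributions
$\mu_3 = 0$
$\mu_5 = 0$
$\mu_7 = 0$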
In the calculation of $\nu_1$, $\nu_2$, $\nu_3$ it is not necessary to start
from the origin. We can take any other value as the origin.
If deviations $x$ have been calculated from an arbitrary origin,
adjustments should be made before calculating the values of $\mu$.
The adjustment formulae are:

$$\mu_1 = \nu_1 - \nu_1 = \frac{\sum fx}{N} - \frac{\sum fx}{N} = 0$$

$$\mu_2 = \nu_2 - \nu_1^2 = \frac{\sum fx^2}{N} - \left(\frac{\sum fx}{N}\right)^2$$

$$\mu_3 = \nu_3 - 3\nu_1\nu_2 + 2\nu_1^3 = \frac{\sum fx^3}{N} - 3\left(\frac{\sum fx}{N}\right)\left(\frac{\sum fx^2}{N}\right) + 2\left(\frac{\sum fx}{N}\right)^3$$

$$\mu_4 = \nu_4 - 4\nu_1\nu_3 + 6\nu_1^2\nu_2 - 3\nu_1^4$$

Illustration—1

Following are the measurements in inches of the lengths of
320 leaves on a branch of a tree. Calculate the moments about
the mean of the distribution of the lengths of leaves.


Length in inches    f      dx (from 4.0)    fdx     fdx²    fdx³     fdx⁴
1.0                 5      -3               -15     45      -135     405
2.0                 38     -2               -76     152     -304     608
3.0                 65     -1               -65     65      -65      65
4.0                 92     0                0       0       0        0
5.0                 70     +1               +70     70      +70      70
6.0                 40     +2               +80     160     +320     640
7.0                 10     +3               +30     90      +270     810

Total               320                     +24     582     +156     2598

$$\nu_1 = \frac{\sum fx}{N} = \frac{24}{320} = 0.075$$

$$\nu_2 = \frac{\sum fx^2}{N} = \frac{582}{320} = 1.81875$$

$$\nu_3 = \frac{\sum fx^3}{N} = \frac{156}{320} = 0.4875$$

$$\nu_4 = \frac{\sum fx^4}{N} = \frac{2598}{320} = 8.11875$$

$$\mu_1 = \nu_1 - \nu_1 = \frac{24}{320} - \frac{24}{320} = 0$$

$$\mu_2 = \nu_2 - \nu_1^2 = 1.81875 - (0.075)^2 = 1.813125$$

$$\mu_3 = \nu_3 - 3\nu_1\nu_2 + 2\nu_1^3 = 0.4875 - 3(0.075)(1.81875) + 2(0.075)^3 = +0.0791$$

$$\mu_4 = \nu_4 - 4\nu_1\nu_3 + 6\nu_1^2\nu_2 - 3\nu_1^4 = 8.11875 - 4(0.075)(0.4875) + 6(0.075)^2(1.81875) - 3(0.075)^4 = 8.034 \text{ (approx.)}$$
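The working above can be reproduced with the following Python sketch, which first finds the moments about the arbitrary origin and then applies the adjustment formulae. The function names are hypothetical; the deviations are the step deviations from 4.0 used in the table.

```python
def moments_about_origin(deviations, freqs, max_order=4):
    """Moments nu_1 .. nu_max_order computed from deviations about an arbitrary origin."""
    n = sum(freqs)
    return [
        sum(f * d ** r for d, f in zip(deviations, freqs)) / n
        for r in range(1, max_order + 1)
    ]


def central_from_raw(v1, v2, v3, v4):
    """Apply the adjustment formulae to obtain the moments about the mean."""
    mu1 = v1 - v1
    mu2 = v2 - v1 ** 2
    mu3 = v3 - 3 * v1 * v2 + 2 * v1 ** 3
    mu4 = v4 - 4 * v1 * v3 + 6 * v1 ** 2 * v2 - 3 * v1 ** 4
    return mu1, mu2, mu3, mu4


deviations = [-3, -2, -1, 0, 1, 2, 3]       # lengths 1.0 .. 7.0 measured from 4.0
leaves = [5, 38, 65, 92, 70, 40, 10]

v1, v2, v3, v4 = moments_about_origin(deviations, leaves)
mu1, mu2, mu3, mu4 = central_from_raw(v1, v2, v3, v4)
print(v1, v2, v3, v4)
print(mu1, mu2, round(mu3, 4), round(mu4, 3))
```

Since the class interval here is one inch, the step deviations are already in inches and no further scaling is needed.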
Illustration—2


The first four moments of a distribution about the arbitrary
origin 4 are as under.


$\nu_1 = 1$, $\nu_2 = 4$, $\nu_3 = 10$, $\nu_4 = 45$


Find out the mean of the distribution and calculate the moments
about the mean and also about zero.


Arbitrary mean = 4
$\nu_1 = 1$
$\nu_1 = \bar{x} - \text{Arbitrary Mean}$
$\bar{x} = \nu_1 + \text{Arbitrary Mean}$
$= 1 + 4$
$= 5$


Moments about the mean:
$\mu_1 = \nu_1 - \nu_1 = 0$
$\mu_2 = \nu_2 - \nu_1^2$
$= 4 - 1^2$
$= 3$
$\mu_3 = \nu_3 - 3\nu_1\nu_2 + 2\nu_1^3$
$= 10 - 3(1)(4) + 2(1)^3$
$= 10 - 12 + 2$
$= 0$
$\mu_4 = \nu_4 - 4\nu_1\nu_3 + 6\nu_1^2\nu_2 - 3\nu_1^4$
$= 45 - 4(1)(10) + 6(1)^2(4) - 3(1)^4$
$= 45 - 40 + 24 - 3$
$= 26$

Moments about zero:
$\nu_1 = \bar{x} - 0 = 5$
$\nu_2 = \mu_2 + \nu_1^2$
$= 3 + 5^2$
$= 28$
$\nu_3 = \mu_3 + 3\nu_1\nu_2 - 2\nu_1^3$
$= 0 + 3(5)(28) - 2(5)^3$
$= 420 - 250$
$= 170$
$\nu_4 = \mu_4 + 4\nu_1\nu_3 - 6\nu_1^2\nu_2 + 3\nu_1^4$
$= 26 + 4(5)(170) - 6(5)^2(28) + 3(5)^4$
$= 26 + 3400 - 4200 + 1875$
$= 5301 - 4200 = 1101$
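A short Python sketch of the two conversions used in this illustration follows: from moments about an arbitrary origin to moments about the mean, and from moments about the mean back to moments about zero. The function names are hypothetical; the formulae are the adjustment formulae as applied above.

```python
def central_from_raw(v1, v2, v3, v4):
    """Moments about the mean from moments about an arbitrary origin."""
    mu2 = v2 - v1 ** 2
    mu3 = v3 - 3 * v1 * v2 + 2 * v1 ** 3
    mu4 = v4 - 4 * v1 * v3 + 6 * v1 ** 2 * v2 - 3 * v1 ** 4
    return v1 - v1, mu2, mu3, mu4


def raw_from_central(v1, mu2, mu3, mu4):
    """Moments about an origin lying at distance v1 from the mean.

    Each moment is built from the one before it, in the same order as
    the hand calculation.
    """
    v2 = mu2 + v1 ** 2
    v3 = mu3 + 3 * v1 * v2 - 2 * v1 ** 3
    v4 = mu4 + 4 * v1 * v3 - 6 * v1 ** 2 * v2 + 3 * v1 ** 4
    return v1, v2, v3, v4


arbitrary_origin = 4
given = (1, 4, 10, 45)                       # moments about the origin 4

mean = given[0] + arbitrary_origin           # 5
central = central_from_raw(*given)
about_zero = raw_from_central(mean, *central[1:])
print(mean, central, about_zero)
```

Feeding `about_zero` back into `central_from_raw` returns the same moments about the mean, which is a convenient check on the arithmetic.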
Skewness based on Moments. In a symmetrical distribution
all the odd moments about the mean, i.e. $\mu_3$, $\mu_5$, $\mu_7$ etc., are
equal to zero. If they are not equal to zero, it means the distribution
is skewed. But this is not a satisfactory measure of
skewness. Therefore, in order to know the skewness, these
moments are related to the standard deviation. Those coefficients
are represented by the Greek letter $\alpha$ (alpha).

(alpha one) $\alpha_1 = \frac{\mu_1}{\sigma} = 0$

(alpha two) $\alpha_2 = \frac{\mu_2}{\sigma^2} = 1$

(alpha three) $\alpha_3 = \frac{\mu_3}{\sigma^3}$

(alpha four) $\alpha_4 = \frac{\mu_4}{\sigma^4}$

Generally $\alpha_3$ is calculated in order to know the skewness,
because $\alpha_1$ is equal to zero.

Hence the measure of skewness is $\alpha_3 = \frac{\mu_3}{\sigma^3}$

Positive and negative signs are to be taken into account.
Another test is $\beta_1$ (Beta one) $= \frac{\mu_3^2}{\mu_2^3} = \alpha_3^2$

If the result is ‘zero’ there is no skewness. No definite
upper limit is apparent for $\alpha_3$ or $\beta_1$, but values as high as ± [...]
indicate marked skewness.


Measures of skewness based on the Third Moment or
Cubed Deviation Method. This measure is ascertained
from the cube root of the third moment of dispersion and is
based on the principle that the sum of the deviations from the
arithmetic mean, when the signs are considered, is zero. If the
items in the given group are symmetrical, then at points of equal
deviation the frequencies are equal and so the result would be
zero. If the deviations from the arithmetic mean are squared,
all the signs would become positive and the distinction between
positive and negative deviations will not exist. Cubes of such
deviations will not only preserve this distinction of the + and -
signs, but the items at the extremes would also get relatively higher
importance.

The absolute measure of skewness is 


EL э ах 
Eu с: 
and relative measure of skewness will be 
[tas 3 | dex 
NN у А UN 
i= КАЙДУ seat or 8 


or 8/4, absolute measure of skewness 
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and 4, — 48 


u$ is the relative measure. 
Illustration—3

Calculate the coefficient of skewness based on the Third
Moment for the following distribution of weights of 225
candidates appearing for the Entrance examination of the Indian
Navy Junior Corps.

Weight in Lbs.    No. of Candidates
  80– 85                  7
  85– 90                 31
  90– 95                 42
  95–100                 54
 100–105                 33
 105–110                 24
 110–115                 22
 115–120                  8
 120–125                  4
 Total                  225
(dx = deviation of the mid-value from 97.5, in class intervals of 5)

Class       M.V.      f     dx     fdx    fdx²    fdx³
 80– 85     82.5      7     −3     −21      63    −189
 85– 90     87.5     31     −2     −62     124    −248
 90– 95     92.5     42     −1     −42      42     −42
 95–100     97.5     54      0       0       0       0
100–105    102.5     33     +1     +33      33     +33
105–110    107.5     24     +2     +48      96    +192
110–115    112.5     22     +3     +66     198    +594
115–120    117.5      8     +4     +32     128    +512
120–125    122.5      4     +5     +20     100    +500
Total               225            +74    +784   +1352
V1 = Σfdx/N  = +74/225  = +0.3289
V2 = Σfdx²/N = 784/225  = +3.4844
V3 = Σfdx³/N = 1352/225 = +6.0089
μ1 = V1 − V1 = 0
μ2 = V2 − V1² = 3.4844 − (0.3289)²
   = 3.3762
μ3 = V3 − 3V1V2 + 2V1³
   = 6.0089 − 3(0.3289)(3.4844) + 2(0.3289)³
   = 2.642

1st Measure
Coeff. of Skewness = √β1 = μ3/√(μ2³)
                   = 2.642/√(3.3762)³
                   = 2.642/6.204
                   = +.43 (approx.)

2nd Measure
SK = ∛μ3/σ = ∛2.642/√3.3762
   = 1.382/1.838
   = +.75 (approx.)
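As a cross-check on the illustration, the step-deviation working can be reproduced in a few lines. This is only a sketch under the assumption that the frequencies are 7, 31, 42, 54, 33, 24, 22, 8 and 4 (total 225) and that deviations are taken from 97.5 in class-interval units; the variable names are illustrative.

```python
import math

step_devs = [-3, -2, -1, 0, 1, 2, 3, 4, 5]   # (mid-value - 97.5) / 5
freqs = [7, 31, 42, 54, 33, 24, 22, 8, 4]

n = sum(freqs)
# Moments about the assumed mean, in class-interval units.
v1 = sum(f * d for f, d in zip(freqs, step_devs)) / n
v2 = sum(f * d**2 for f, d in zip(freqs, step_devs)) / n
v3 = sum(f * d**3 for f, d in zip(freqs, step_devs)) / n

# Moments about the mean.
mu2 = v2 - v1**2
mu3 = v3 - 3 * v1 * v2 + 2 * v1**3
sigma = math.sqrt(mu2)

alpha3 = mu3 / sigma**3                               # first measure
cube_root_mu3 = math.copysign(abs(mu3) ** (1 / 3), mu3)
sk = cube_root_mu3 / sigma                            # second measure

print(round(alpha3, 2), round(sk, 2))  # 0.43 0.75
```

Both coefficients are pure numbers, so it does not matter that the moments are expressed in class-interval units rather than in pounds.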
KURTOSIS 


Averages refer to the central tendency of a series. Dispersion
refers to the spread of the items on either side of some measure
of central tendency. Skewness refers to the lack of symmetry
of a series. Kurtosis is another measure, which refers to the
'peakedness' of a curve. According to Croxton and Cowden,
“A measure of kurtosis indicates the degree to which a curve
of the frequency distribution is peaked or flat topped.” A
peaked curve is called “LEPTOKURTIC” and a flat-topped
curve is termed “PLATYKURTIC”. These are evaluated by
comparison with an intermediate form called “MESOKURTIC”,
which is the Normal curve. These three curves differ widely
in regard to convexity, an attribute to which Karl Pearson referred
as 'KURTOSIS'. The following diagram will illustrate :—

[Diagram: leptokurtic, mesokurtic and platykurtic curves]

Student, a famous British statistician, has written an
amusing sentence, which reads, “Platykurtic curves are like the
platypus, squat with short tails; leptokurtic curves are like
the kangaroo, high with long tails, noted for 'lepping'.”

The measurement of kurtosis is based upon the Fourth
Moment. It is measured by

    β2 = μ4/μ2²   or   α4 = μ4/σ⁴

In a normal distribution β2 will be equal to 3. If it is
greater than 3, the curve is more peaked; if less than 3, the curve
is flatter at the top than normal.

Hence K = α4 − 3 = β2 − 3.

If 'K' is positive, it means that the number of cases near
the mean is greater than in a normal distribution. If 'K' is
negative, the curve is more flat-topped than the corresponding
normal curve.

Illustration—4

Find out the degree of kurtosis in the following distribution
of 82 families according to their monthly rent.

Monthly Rent (Rs.)    No. of Families
   10– 20                   2
   20– 30                   1
   30– 40                   2
   40– 50                   6
   50– 60                  16
   60– 70                  27
   70– 80                  16
   80– 90                   7
   90–100                   3
  100–110                   1
  110–120                   1
  Total                    82

(dx = deviation of the mid-value from 65, in class intervals of 10)

Class      M.V.     f     dx    fdx    fdx²    fdx³    fdx⁴
 10– 20     15      2     −5    −10     50    −250    1250
 20– 30     25      1     −4    − 4     16    − 64     256
 30– 40     35      2     −3    − 6     18    − 54     162
 40– 50     45      6     −2    −12     24    − 48      96
 50– 60     55     16     −1    −16     16    − 16      16
 60– 70     65     27      0      0      0       0       0
 70– 80     75     16     +1    +16     16    + 16      16
 80– 90     85      7     +2    +14     28    + 56     112
 90–100     95      3     +3    + 9     27    + 81     243
100–110    105      1     +4    + 4     16    + 64     256
110–120    115      1     +5    + 5     25    +125     625
Total              82             0    236    − 90    3032
(Note :—The moments are in fact calculated from the actual
arithmetic mean, because the sum of the deviations is equal to
zero. In order to verify the calculations, however, V1, V2, V3
and V4 have been calculated and the adjustments to μ1, μ2, μ3
and μ4 are shown. The results are the same.)

V1 = Σfdx/N  = 0/82    = 0
V2 = Σfdx²/N = 236/82  = 2.8780
V3 = Σfdx³/N = −90/82  = −1.097
V4 = Σfdx⁴/N = 3032/82 = 36.9756
and—
μ1 = V1 − V1 = 0
μ2 = V2 − V1²
   = 2.8780 − (0)²
   = 2.8780
μ3 = V3 − 3V1V2 + 2V1³
   = (−1.097) − 3(0)(2.8780) + 2(0)³
   = −1.097
μ4 = V4 − 4V1V3 + 6V1²V2 − 3V1⁴
   = 36.9756 − 4(0)(−1.097) + 6(0)²(2.8780) − 3(0)⁴
   = 36.9756

α4 or β2 = μ4/μ2² = 36.9756/(2.8780)² = 4.46
K = α4 − 3 = 4.46 − 3 = +1.46
β2 is more than 3, hence the curve is more peaked, or
'Leptokurtic'.
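The computation of β2 from a frequency table can be sketched as follows. The helper name beta2_grouped is hypothetical; the data are the 82 families of the illustration, with deviations from 65 in class intervals of 10.

```python
def beta2_grouped(step_devs, freqs):
    """Return beta-two (mu4 / mu2 squared) for grouped data given step deviations."""
    n = sum(freqs)
    v1, v2, v3, v4 = (
        sum(f * d**k for f, d in zip(freqs, step_devs)) / n for k in (1, 2, 3, 4)
    )
    mu2 = v2 - v1**2
    mu4 = v4 - 4 * v1 * v3 + 6 * v1**2 * v2 - 3 * v1**4
    return mu4 / mu2**2


step_devs = list(range(-5, 6))                      # -5, -4, ..., +5
freqs = [2, 1, 2, 6, 16, 27, 16, 7, 3, 1, 1]

beta2 = beta2_grouped(step_devs, freqs)
k = beta2 - 3
print(round(beta2, 2), round(k, 2))  # 4.46 1.46
```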


Illustration—5

The following data are given to an economist for the purpose of
economic analysis. The data refer to the length of life of a
sample of Good-year Tyres. Do you find that the data are
Platykurtic ?

N = 100, Σfdx = 50, Σfdx² = 1967.2,
Σfdx³ = 2925.8, Σfdx⁴ = 86650.2

V1 = Σfdx/N  = 50/100      = +0.50
V2 = Σfdx²/N = 1967.2/100  = +19.672
V3 = Σfdx³/N = 2925.8/100  = +29.258
V4 = Σfdx⁴/N = 86650.2/100 = +866.502
Now
μ1 = V1 − V1 = 0
μ2 = V2 − V1²
   = 19.672 − (.50)²
   = 19.422
μ3 = V3 − 3V1V2 + 2V1³
   = 29.258 − 3(.50)(19.672) + 2(.50)³
   = 0
μ4 = V4 − 4V1V3 + 6V1²V2 − 3V1⁴
   = 866.502 − 4(.50)(29.258) + 6(.50)²(19.672) − 3(.50)⁴
   = 837.304

β2 = α4 = μ4/μ2² = 837.304/(19.422)² = 2.22

It is less than 3. Hence the curve is platykurtic.
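When only the totals Σfdx, Σfdx², Σfdx³ and Σfdx⁴ are supplied, as in this illustration, the classification can be sketched as below. The function names and the tolerance used for "mesokurtic" are assumptions made for the example, not part of the text.

```python
def beta2_from_sums(n, s1, s2, s3, s4):
    """beta-two from N and the sums of f*dx, f*dx^2, f*dx^3, f*dx^4."""
    v1, v2, v3, v4 = (s / n for s in (s1, s2, s3, s4))
    mu2 = v2 - v1**2
    mu4 = v4 - 4 * v1 * v3 + 6 * v1**2 * v2 - 3 * v1**4
    return mu4 / mu2**2


def classify(beta2, tol=0.01):
    """Name the curve by comparing beta-two with 3 (tol is an arbitrary choice)."""
    if beta2 > 3 + tol:
        return "leptokurtic"
    if beta2 < 3 - tol:
        return "platykurtic"
    return "mesokurtic"


beta2 = beta2_from_sums(100, 50, 1967.2, 2925.8, 86650.2)
print(round(beta2, 2), classify(beta2))  # 2.22 platykurtic
```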


Utility of Kurtosis. The measure of kurtosis is not of much
use in economic and social studies, because in these cases a
normal distribution is usually out of the question. But it has
great importance in biological studies and in studies relating
to the physical sciences.

Theoretical Questions

1—Define moments and discuss the method of calculating
moments of dispersion about the mean.

2—How would you calculate the value of a moment about
the mean from the value of the moment about an arbitrary value ?

3—What is kurtosis ? What purpose does it serve ? Is the
study of kurtosis useful in economic and social sciences ? If not,
why ?

4—(a) Distinguish between Skewness and Kurtosis. (b)
How is Kurtosis helpful to judge whether a distribution is normal ?
Practical Questions

1—Find the first three moments for the following data :—

Class      f
72–74      7
74–76     31
76–78     42
78–80     54
80–82     33
82–84     24
84–86     22
86–88      8
88–90      4

(Ans. Taking 79 as assumed mean—
V1 = +.3289
V2 = +3.484
V3 = +6.008
μ1 = 0
μ2 = 3.376
μ3 = 2.642)

2—The first four moments of a distribution about x = 4 are
1.35, 10.5, 48.6 and 282.0. Find out mean and moments about it.
(Ans. Mean = 5.35, μ1 = 0, μ2 = 8.68, μ3 = 11.09, μ4 = 124.25).

3—Calculate the value of kurtosis in the following series of
marks in a general knowledge test :—

Marks    No. of Students
 59            1
 61            2
 63            5
 65           16
 67           52
 69           16
 71            5
 73            2
 75            1
Total        100
(Ans. 2.24)

4—The following table gives the height of a batch of 100
students. Find the value of kurtosis in the distribution.

Height in Inches    No. of Students
      59                   0
      61                   2
      63                   8
      65                  20
      67                  40
      69                  20
      71                   8
      73                   2
      75                   0

CHAPTER 11
CORRELATION 


Correlation is the study of the relationship between two variables.
When there is a relationship of a quantitative nature between
two sets of phenomena, the appropriate statistical tool for
discovering and measuring the relationship and expressing it in
a precise way is known as correlation. Some relationship is
found between certain types of variables; for example, there is a
relationship between supply and price, demand and employment,
production and import, and so on. It is the function of the
measure of correlation to find and measure such relationships.
According to Profs. Croxton and Cowden, “Correlation, also called
Co-variation, is the causal relationship existing between any two
variables depicting separate characters, the connection being that
of direct cause and effect or of mutually reactive causes or the
like, the coefficient of Correlation being its numerical
measurement.”

When two phenomena vary simultaneously in the same
direction or in opposite directions, and the variation in
the one is the cause of the variation in the other, the two
phenomena are said to be correlated. In the words of W. I.
King, “Correlation means that between two series or groups of
data there exists some causal connections.” At another place
he says, “If it is proved true that in a large number of instances
two variables tend always to fluctuate in the same or in opposite
directions, we consider that the fact is established and that a
relationship exists. The relationship is called correlation.”

Connor states that, “If two or more quantities vary in 


Thus correla 
relationship between two or more related variables. 


There will be correlation between two variables if changes
in both are in one direction or in opposite directions. If one
variable increases or decreases, the other may also increase or
decrease. Such correlation is termed Direct or Positive
correlation. On the other hand, if, when one variable increases
or decreases, the other moves in the reverse direction, there
will be Inverse or Negative correlation. For example, when price
increases, supply also increases, or vice versa; therefore there is
positive correlation between price and supply. But when price
increases, demand goes down, or vice versa; here we find a case
of negative correlation.

The credit of developing this measure goes to two eminent 
statisticians—Sir Francis Galton and Karl Pearson. They 
studied many problems of Biology with the help of this technique. 
Measure of correlation also ensures that interpolation or extra- 
polation of variables will also be reliable. In Economics this 
technique has special uses. According to Prof. Neiswanger,
“correlation analysis contributes to the understanding of
economic behaviour, aids in locating the critically important
variables on which others depend, may reveal to the economist
the connections by which disturbances spread and suggest to him
the paths through which stabilizing forces may become effective.”

Degree of Correlation. When changes in two related
variables are exactly proportional, there is perfect correlation.
If changes are not proportional, the degree of correlation is
limited. When there is equal proportional change in the same
direction, there is perfect positive correlation between the two
variables. On the other hand, if equal proportional change is in
the reverse direction, there is perfect negative correlation
between the two variables. If there is unequal change in the same
direction, correlation is said to be limited positive, and if there
is unequal change in the opposite direction, the correlation is
limited negative.

Karl Pearson has given a formula for measuring correlation.
The result of this formula varies between ±1. In case of perfect
positive correlation the result will be +1, and in case of perfect
negative correlation the result will be −1. If the result is '0' there
is absence of correlation. The following chart shows the degrees
of correlation according to Karl Pearson's formula.

[Chart: DEGREES OF CORRELATION, running from perfect negative (−1)
through high, moderate and low negative, no correlation (0), and
low, moderate and high positive, to perfect positive (+1)]
Methods of determining Correlation. The different methods
of finding out correlation are :—

A—Graphic Methods
    (i) Scatter Diagram or Dotogram,
    (ii) Simple graph.

B—Mathematical methods
    (i) Karl Pearson's coefficient of correlation,
    (ii) Spearman's Rank Coefficient of Correlation,
    (iii) Coefficient of Concurrent Deviations,
    (iv) Least Squares method.


Scatter Diagram or Dotogram. One can get some idea
whether there is any relationship between two variables by
plotting the values on a scatter diagram. We measure the
x-variable on the horizontal axis and the y-variable on the
vertical axis and plot a point for each pair of x and y values. In
this way the whole of the data is plotted in the shape of points.
If these points show some trend, either upward or downward, the
two variables are correlated. If the plotted points do not show any
trend, the two variables have no correlation. If the trend of the
points is upward, rising from the left bottom and going up towards
the right top, correlation is positive. On the other hand, if the
tendency is reverse, so that the points show a downward trend
from the left top to the right bottom, correlation is negative.

The scatter diagram will take the following shapes :—

[Figures: scatter diagrams showing positive correlation, no
correlation and negative correlation]

In a scatter diagram, the coefficient of correlation will be
high-positive if the points lie near a line rising from left to
right (Figure I). On the other hand, if the points lie near a line
falling from left to right (Figure II), the correlation will be
high-negative. If the points lie away from the line, correlation
will be moderate or low, according to their distance from the line.

[Figure I: perfect positive correlation. Figure II: perfect
negative correlation]

One of the serious limitations of scatter diagram is that, it 
shows only whether there is correlation or not between two 
variables. Degree of correlation cannot be known by this 
method. 


Simple Graph. Two series may be plotted on a graph 
paper, and by study of their direction and closeness, a rough 
idea about their correlation can be made. If two curves run 
parallel, then there is positive correlation, but if they run in 
opposite direction, then there will be inverse correlation between 
them. But by this method exact degree of correlation cannot 
be known. 


Illustration —1 


Find the correlation between the income and expenditure
of a wage earner on piece rate system, working in a factory :—

Months        Income     Expenditure
              in Rs.       in Rs.
October 46 36 
November 54 40 
December 56 44 
January 56 54 
February 58 42 
March 60 58 
April 62 54 
May 66 58 


The graph indicates that there is positive correlation 
between income and expenditure. 
[Correlation graph: income and expenditure in rupees plotted
month by month, October to May]


Karl Pearson's Coefficient of Correlation. Karl Pearson's
measure of correlation is based upon the Arithmetic average and
the Standard Deviation. It is regarded as the most satisfactory
measure of correlation, as the answer varies within ±1. +1
represents perfect positive correlation and −1 perfect negative
correlation. The results between ±1 are interpreted as showing
limited correlation. Karl Pearson's formula is based on the
following assumptions :—
(1) The two series sought to be correlated are affected by a
large number of independent causes which bring about a normal
distribution in the series. (2) The forces affecting the
distribution of items in the two series are related to each other
in a relationship of cause and effect. (3) There is a linear
relationship between the two series. This means that if points are
plotted on a scatter diagram, a line will be formed by the plotted
points.

Karl Pearson's coefficient of correlation is represented by
'r' and its formula is

    r = Σxy/(N σx σy)
      = Σxy/(N √(Σdx²/N × Σdy²/N))
      = Σxy/√(Σdx² × Σdy²)

Where Σxy is the total of the products of deviations of x
and y series from their arithmetic means,

N = number of items
σx = Standard Deviation of x series
σy = Standard Deviation of y series.

The second and third forms given above are
calculation-saving devices.
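The third form of the formula lends itself to a direct computation. The sketch below applies it to the income and expenditure figures of Illustration 1; the function name pearson_r is illustrative, and the text itself does not work out a coefficient for those figures.

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's r from deviations about the arithmetic means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    return sum_xy / math.sqrt(sum_dx2 * sum_dy2)


income = [46, 54, 56, 56, 58, 60, 62, 66]
expenditure = [36, 40, 44, 54, 42, 58, 54, 58]

r = pearson_r(income, expenditure)
print(round(r, 2))  # 0.82
```

The result agrees with the impression given by the graph, namely that income and expenditure are positively correlated.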
Illustration—2

Ten students got the following percentage of marks in
Accountancy and Statistics. Find the coefficient of correlation
and state your deductions :—

Student        1    2    3    4    5    6    7    8    9   10
Accountancy   78   36   98   25   75   82   90   62   65   39
Statistics    84   51   91   60   68   62   86   58   53   47

(B. Com., B. H. U.)

          X Accountancy                Y Statistics
Student   Marks  Deviation  Square   Marks  Deviation  Square   Product of
No.         x    from (65)  of dx      y    from (66)  of dy    deviations
                    dx       dx²                dy      dy²        xy
  1        78      +13       169      84      +18      324       +234
  2        36      −29       841      51      −15      225       +435
  3        98      +33      1089      91      +25      625       +825
  4        25      −40      1600      60      − 6       36       +240
  5        75      +10       100      68      + 2        4       + 20
  6        82      +17       289      62      − 4       16       − 68
  7        90      +25       625      86      +20      400       +500
  8        62      − 3         9      58      − 8       64       + 24
  9        65        0         0      53      −13      169          0
 10        39      −26       676      47      −19      361       +494
N = 10    650        0      5398     660        0     2224      +2704

Arithmetic average of x series = Σx/N = 650/10 = 65
Arithmetic average of y series = Σy/N = 660/10 = 66
Standard Deviation of x series = √(Σdx²/N) = √(5398/10)
                               = √539.8 = 23.2
Standard Deviation of y series = √(Σdy²/N) = √(2224/10)
                               = √222.4 = 14.9

r = Σxy/(N σx σy) = 2704/(10 × 23.2 × 14.9) = 2704/3456.8 = +0.78

Thus there is a high degree of positive correlation between
marks in Accountancy and Statistics.

By other formulae.

(1) r = Σxy/(N √(Σdx²/N × Σdy²/N))
      = 2704/(10 √(539.8 × 222.4))
      = 2704/3464.8
      = +0.78

(2) r = Σxy/√(Σdx² × Σdy²)
      = 2704/√(5398 × 2224)
      = 2704/3464.8
      = +0.78

(3) This can also be solved with the help of log-tables.

    r = Σxy/√(Σdx² × Σdy²)
      = Antilog {log Σxy − ½ (log Σdx² + log Σdy²)}
      = 2704/√(5398 × 2224)
      = A. L. {3.4320 − ½ (3.7322 + 3.3472)}
      = A. L. of 1̄.8923 = +0.78

To facilitate calculations, when the arithmetic average is not a
whole number or the series are large, the short-cut method of
calculating the coefficient of correlation may be used. The formulae
according to the short-cut method are

(1) r = [Σxy − N (Σdx/N)(Σdy/N)]
        ÷ [N √(Σdx²/N − (Σdx/N)²) × √(Σdy²/N − (Σdy/N)²)]

(2) r = [Σxy × N − (Σdx × Σdy)]
        ÷ √{[Σdx² × N − (Σdx)²] × [Σdy² × N − (Σdy)²]}

(3) r = [Σxy − N (a1 − x1)(a2 − x2)] ÷ (N σx σy)

Where Σxy = sum of the products of deviations of the two series
from their assumed means.

Σdx² and Σdy² = sum of squares of deviations from assumed mean.
Σdx and Σdy = sum of deviations from assumed mean.

a1 = actual mean of x series
x1 = assumed mean of x series
a2 = actual mean of y series
x2 = assumed mean of y series.


Illustration—3

Calculate the coefficient of correlation between the marks
secured by 12 students in two tests.

Students   A   B   C   D   E   F   G   H   I   J   K   L
Test I    50  54  56  59  60  62  61  65  67  71  71  74
Test II   22  25  34  28  26  30  32  30  28  34  36  40

          Test 1   Deviations   Deviations   Test 2   Deviations   Deviations   Product of
Student            from assum.   squared              from assum.   squared     deviations
                    mean, 60                           mean, 30
             x        dx           dx²          y        dy           dy²           xy
   A        50       −10           100         22       − 8            64          + 80
   B        54       − 6            36         25       − 5            25          + 30
   C        56       − 4            16         34       + 4            16          − 16
   D        59       − 1             1         28       − 2             4          +  2
   E        60         0             0         26       − 4            16             0
   F        62       + 2             4         30         0             0             0
   G        61       + 1             1         32       + 2             4          +  2
   H        65       + 5            25         30         0             0             0
   I        67       + 7            49         28       − 2             4          − 14
   J        71       +11           121         34       + 4            16          + 44
   K        71       +11           121         36       + 6            36          + 66
   L        74       +14           196         40       +10           100          +140
          Σx = 750  Σdx = +30  Σdx² = 670   Σy = 365  Σdy = +5   Σdy² = 285    Σxy = 334

r = [Σxy − N (Σdx/N)(Σdy/N)]
    ÷ [N √(Σdx²/N − (Σdx/N)²) × √(Σdy²/N − (Σdy/N)²)]
  = [334 − 12 (2.5)(0.4167)] ÷ [12 √((55.83 − 6.25) × (23.75 − 0.17))]
  = 321.5 ÷ (12 √(49.58 × 23.58))
  = 321.5/410.3
  = +0.78
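The short-cut formula (2) can be sketched as a small function. The name pearson_r_assumed_mean is hypothetical; the data are the marks of the twelve students, with 60 and 30 taken as the assumed means.

```python
import math


def pearson_r_assumed_mean(xs, ys, assumed_x, assumed_y):
    """Short-cut method: deviations are taken from assumed (working) means."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dx2 = sum(d * d for d in dx)
    sum_dy2 = sum(d * d for d in dy)
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    numerator = sum_xy * n - sum_dx * sum_dy
    denominator = math.sqrt(
        (sum_dx2 * n - sum_dx**2) * (sum_dy2 * n - sum_dy**2)
    )
    return numerator / denominator


test1 = [50, 54, 56, 59, 60, 62, 61, 65, 67, 71, 71, 74]
test2 = [22, 25, 34, 28, 26, 30, 32, 30, 28, 34, 36, 40]

r = pearson_r_assumed_mean(test1, test2, 60, 30)
r_other_origin = pearson_r_assumed_mean(test1, test2, 0, 0)
print(round(r, 2))  # 0.78
```

The choice of assumed mean affects only the size of the intermediate figures; any other origin, such as zero, leads to the same coefficient.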
Illustration—4

Calculate the coefficient of correlation between the values of
x and y given below :—

 x      y
78    125
89    137
97    156
69    112
59    107
79    136
68    123
61    108

(You may use 69 as working mean for x and 112 as that
for y.)                      (M.A. Delhi, B. Com. Vikram)

  x     dx (69)    dx²       y     dy (112)    dy²       xy
 78      + 9        81      125     + 13       169     + 117
 89      +20       400      137     + 25       625     + 500
 97      +28       784      156     + 44      1936     +1232
 69        0         0      112        0         0         0
 59      −10       100      107     −  5        25     +  50
 79      +10       100      136     + 24       576     + 240
 68      − 1         1      123     + 11       121     −  11
 61      − 8        64      108     −  4        16     +  32
N = 8    +48      1530     N = 8    +108      3468     +2160

r = [Σxy − N (Σdx/N)(Σdy/N)]
    ÷ [N √(Σdx²/N − (Σdx/N)²) × √(Σdy²/N − (Σdy/N)²)]
  = [2160 − 8 (48/8)(108/8)]
    ÷ [8 √((1530/8 − (48/8)²) × (3468/8 − (108/8)²))]
  = 1512 ÷ (8 √(155.25 × 251.25))
  = 1512/1580
  = +.96
Illustration—5

Compute the coefficient of correlation from the following
data.

    X         Y
  1200     −3600
 −1000      3300
 − 800      2400
 − 400      1200
  1200     −3600
  1400     −2100
 − 600      1800
 −1000      3000

    X     dx from        dx²          Y     dy from         dy²            xy
          average (0)                       average (300)
  1200      1200      14,40,000    −3600     −3900      1,52,10,000   − 46,80,000
 −1000     −1000      10,00,000     3300      3000        90,00,000   − 30,00,000
 − 800     − 800       6,40,000     2400      2100        44,10,000   − 16,80,000
 − 400     − 400       1,60,000     1200       900         8,10,000   −  3,60,000
  1200      1200      14,40,000    −3600     −3900      1,52,10,000   − 46,80,000
  1400      1400      19,60,000    −2100     −2400        57,60,000   − 33,60,000
 − 600     − 600       3,60,000     1800      1500        22,50,000   −  9,00,000
 −1000     −1000      10,00,000     3000      2700        72,90,000   − 27,00,000
N = 8
Σx = 0    Σdx = 0     80,00,000   Σy = 2400  Σdy = 0    5,99,40,000   −2,13,60,000

r = Σxy/√(Σdx² × Σdy²)
  = −2,13,60,000/√(80,00,000 × 5,99,40,000)
  = −2136/2189.8
  = −.975
Illustration—6

The following table gives the distribution of the total
population and those who are wholly or partially blind among
them. Find out if there is any relation between age and
blindness.

Age      No. of persons    Blind
          in thousand
 0–10         100            55
10–20          60            40
20–30          40            40
30–40          36            40
40–50          24            36
50–60          11            22
60–70           6            18
70–80           3            15

Note. In order to make the data comparable it is necessary
to find out the number of blind out of a fixed number (a common
unit). Here this unit will be one lakh.

 Age     M.V. of   dx from   dx²    No. of     No. of   Blind per   dy from    dy²       xy
         x series  as. av.          persons    blind    lakh (y)    as. av.
                     45             (thousand)                       150
 0–10       5       −40     1600      100       55        55        − 95      9025     3800
10–20      15       −30      900       60       40        67        − 83      6889     2490
20–30      25       −20      400       40       40       100        − 50      2500     1000
30–40      35       −10      100       36       40       111        − 39      1521      390
40–50      45         0        0       24       36       150           0         0        0
50–60      55       +10      100       11       22       200        + 50      2500      500
60–70      65       +20      400        6       18       300        +150     22500     3000
70–80      75       +30      900        3       15       500        +350    122500    10500
n = 8               −40     4400                                    +283    167435    21680

r = [Σxy − N (Σdx/N)(Σdy/N)]
    ÷ [N √(Σdx²/N − (Σdx/N)²) × √(Σdy²/N − (Σdy/N)²)]
  = [21680 − 8 (−40/8)(283/8)]
    ÷ [8 √((4400/8 − (−40/8)²) × (167435/8 − (283/8)²))]
  = 23095 ÷ (8 √(525 × 19678))
  = 23095/25713
  = +.898


Illustration—7

From the following data find out if there is any relationship
between density of population and death rate :—

District    Area in Sq. Miles    Population    No. of Deaths
   A              120              24,000           288
   B              150              75,000         1,125
   C               80              48,000           768
   D               50              40,000           720
   E              200              50,000           650

(In this question we have to find out the correlation between
density of population and death rate. Therefore we should
first calculate density and death rates. Density per sq. mile
can be calculated by dividing the population by the area. Death
rate can be calculated by

    No. of Deaths × 1000 ÷ Total Population.)

District   Density    dx (500)     dx²     D. Rate    dy (15)    dy²     xy
              x                               y
   A         200       −300       90000      12        −3         9      900
   B         500          0           0      15         0         0        0
   C         600       +100       10000      16        +1         1      100
   D         800       +300       90000      18        +3         9      900
   E         250       −250       62500      13        −2         4      500
 N = 5      2350       −150      252500                −1        23     2400

r = [Σxy − N (Σdx/N)(Σdy/N)]
    ÷ [N √(Σdx²/N − (Σdx/N)²) × √(Σdy²/N − (Σdy/N)²)]
  = [2400 − 5 (−150/5)(−1/5)]
    ÷ [5 √((252500/5 − (−150/5)²) × (23/5 − (−1/5)²))]
  = 2370 ÷ (5 √(49600 × 4.56))
  = 2370/2378
  = +.997
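The two preparatory steps of this illustration, deriving density and death rate, followed by the correlation itself, can be sketched as below. The helper pearson_r is an illustrative name, not something defined in the text.

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's r from deviations about the arithmetic means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy)
    )


area = [120, 150, 80, 50, 200]                   # sq. miles, districts A to E
population = [24000, 75000, 48000, 40000, 50000]
deaths = [288, 1125, 768, 720, 650]

density = [p / a for p, a in zip(population, area)]            # persons per sq. mile
death_rate = [d * 1000 / p for d, p in zip(deaths, population)]  # per thousand

r = pearson_r(density, death_rate)
print(density)       # [200.0, 500.0, 600.0, 800.0, 250.0]
print(death_rate)    # [12.0, 15.0, 16.0, 18.0, 13.0]
print(round(r, 3))   # 0.997
```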
Illustration—8

Given :—

Number of pairs of observations of x and y series = 15
x series arithmetic average = 25
σx = 3.01
y series arithmetic average = 18
σy = 3.03
Sum of products of deviations of x and y series = +122
Find out the coefficient of correlation.     (B. Com. Allahabad)

r = Σxy/(N σx σy) = 122/(15 × 3.01 × 3.03) = 122/136.8 = +.89

Correlation of Time Series. Historical data spread over a
period of time, constituting a time series, depict two sorts of
fluctuations :

(a) Long term    (b) Short term

If two time series are correlated without making any change
in the series, the resulting coefficient of correlation will include
both long term and short term changes. However, long-term
changes will be more heavily represented. If it is desired to study
the correlation of short period changes, the trend values must be
eliminated by the method of moving averages. The deviations of
the original values from their moving averages will then constitute
the x and y series, and a coefficient of correlation will be
calculated from them. In such cases deviations are taken
from the trend rather than from the arithmetic mean.


Illustration—9

Following are the indices of supply and price. Compute r
for short time oscillations taking a 5 yearly moving average.

Year Index of supply Index of Price
1 91 117
2 98 97
3 95 102
4 92 108
5 93 105 
6 96 96 
7 102 7T 
8 107 68 
9 104 7T 

10 98 93 

11 100 89 

12 108 83 

13 116 78 

14 114 84 
r = Σxy/√(Σdx² × Σdy²) = −91/√(37 × 228)
  = −91/√8436
  = −91/91.85
  = −0.99


Correlation of cyclical fluctuations. For this it is necessary
to obtain figures of cyclical fluctuations. The formula for
calculating r is

    r = Σxy/N

As the values are already divided by their respective standard
deviations, there is no need to divide Σxy by the product of the
standard deviations.

Illustration—11

From the standard deviation cycles of the production and
price of coal for the period 1950–1960 find the coefficient of
correlation :—

Year      Production σ     Price σ
1950         −0.06          +0.12
1951         −0.85          −1.56
1952         +0.79          −0.32
1953         +1.06          +0.16
1954         +0.94          +2.79
1955         +2.40          −0.20
1956         +0.24          +0.12
1957         +0.37          −0.16
1958         −1.03          −0.72
1959         +0.43          −0.16
1960         +0.46          −0.12

(M. Com., Alld.)
          σ of x series    σ of y series    Product of S.D. cycles
Year          (x)              (y)                  xy
1950         −0.06            +0.12              −0.0072
1951         −0.85            −1.56              +1.3430
1952         +0.79            −0.32              −0.2528
1953         +1.06            +0.16              +0.1696
1954         +0.94            +2.79              +2.6226
1955         +2.40            −0.20              −0.4800
1956         +0.24            +0.12              +0.0288
1957         +0.37            −0.16              −0.0592
1958         −1.03            −0.72              +0.7416
1959         +0.43            −0.16              −0.0688
1960         +0.46            −0.12              −0.0552
                                                 +4.9056
                                                 −0.9232
                                            Σxy = 3.9824

r = Σxy/N = 3.9824/11 = +.36
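Because the cycles are already expressed in standard-deviation units, the coefficient is simply the mean of the products. The sketch below assumes the eleven pairs of values read as −0.06, −0.85, ... for production and +0.12, −1.56, ... for price; the variable names are illustrative, and a small difference from the printed total can arise from rounding in the printed products.

```python
production = [-0.06, -0.85, 0.79, 1.06, 0.94, 2.40, 0.24, 0.37, -1.03, 0.43, 0.46]
price = [0.12, -1.56, -0.32, 0.16, 2.79, -0.20, 0.12, -0.16, -0.72, -0.16, -0.12]

# Each value is already a deviation divided by its series' standard deviation,
# so r is the sum of the products divided by the number of pairs.
products = [x * y for x, y in zip(production, price)]
r = sum(products) / len(products)
print(round(r, 2))  # 0.36
```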
Product-Moment Correlation. If there is a small number
of items in the two series, their correlation can be found out by
the product moment method. The formula is—

    r = (AΣxy − x̄ ȳ) ÷ √((AΣx² − x̄²) × (AΣy² − ȳ²))

Where—

AΣxy represents the arithmetic mean of the summation
of the products of x and y,

x̄ and ȳ represent the arithmetic averages of the x and y series,

AΣx² and AΣy² represent the arithmetic means of the summation
of squares of the items of the x and y series respectively.

Illustration—12

Calculate the coefficient of correlation by the product-moment
method :—

            x       x²       y       y²      xy
            4       16       1        1       4
            6       36       2        4      12
            8       64       3        9      24
           10      100       4       16      40
           12      144       5       25      60
           14      196       6       36      84
           16      256       7       49     112
Total      70      812      28      140     336
Average    10      116       4       20      48
           x̄      AΣx²      ȳ      AΣy²    AΣxy

r = (AΣxy − x̄ ȳ) ÷ √((AΣx² − x̄²) × (AΣy² − ȳ²))
  = (48 − 10 × 4) ÷ √((116 − 10²) × (20 − 4²))
  = 8/√(16 × 4)
  = 8/8
  = +1
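The product-moment formula works on the original values rather than on deviations, which the following sketch makes explicit. The function name product_moment_r is illustrative; the data are those of Illustration 12.

```python
import math


def product_moment_r(xs, ys):
    """r from the means of x, y, x squared, y squared and xy (no deviations needed)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_x2 = sum(x * x for x in xs) / n
    mean_y2 = sum(y * y for y in ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    return (mean_xy - mean_x * mean_y) / math.sqrt(
        (mean_x2 - mean_x**2) * (mean_y2 - mean_y**2)
    )


x = [4, 6, 8, 10, 12, 14, 16]
y = [1, 2, 3, 4, 5, 6, 7]

r = product_moment_r(x, y)
print(r)  # 1.0
```

The value +1 is to be expected here, since every x is exactly 2y + 2, i.e. the two series are in a perfect linear relationship.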


Correlation in Grouped Series. If the values of two variables
are grouped and the frequencies of the different groups are given,
then we will have to adjust the working of the coefficient of
correlation. Frequencies in each group are related to both the
variables. There are two methods of calculating the coefficient of
correlation of such series. The first method is the direct method.
Under this method standard deviations are calculated separately
and for calculating Σxy we prepare an extensive table. The
second method is the short cut. Under this method the table is so
adjusted as to enable the calculation of the standard deviations also.

Direct Method :—

Illustration—13

Calculate the coefficient of correlation between the ages of
husbands and wives from the undernoted data and comment
upon the result obtained :—

                          Ages of wives
Ages of
husbands    10–20    20–30    30–40    40–50    50–60    Total
10–20         6        3        —        —        —         9
20–30         3       16       10        —        —        29
30–40         —       10       15        7        —        32
40–50         —        —        7       10        4        21
50–60         —        —        —        4        5         9
Total         9       29       32       21        9       100

(B. Com. B.H.U.)

Standard Deviation of Ages of Husbands :—

  m        m.v.     f     dx (35)     fdx      fdx²
10–20       15      9      −20       −180      3600
20–30       25     29      −10       −290      2900
30–40       35     32        0          0         0
40–50       45     21      +10       +210      2100
50–60       55      9      +20       +180      3600
                  100                − 80     12200

σ = √(Σfdx²/N − (Σfdx/N)²)
  = √(12200/100 − (−80/100)²)
  = √(122 − .64) = √121.36
  = 11.01

(Standard Deviation of y series will also be the same, as
frequencies and classes are the same in this question.)
Products of deviations (f × dx × dy) :—

                               Ages of wives
Ages of        M.V.    10–20    20–30    30–40    40–50    50–60
husbands               (15)     (25)     (35)     (45)     (55)
          dx \ dy      −20      −10        0      +10      +20     Total
10–20       −20        2400      600       —        —        —      3000
20–30       −10         600     1600       0        —        —      2200
30–40         0          —         0       0        0        —         0
40–50       +10          —        —        0     1000      800      1800
50–60       +20          —        —        —      800     2000      2800
Total                  3000     2200       0     1800     2800      9800

r = [Σxy − N (Σfdx/N)(Σfdy/N)] ÷ (N σx σy)
  = [9800 − 100 (−80/100)(−80/100)] ÷ (100 × √121.36 × √121.36)
  = (9800 − 64)/12136
  = 9736/12136
  = +.802

By Short cut Method. For the construction of the extensive table
for the purpose of calculating the coefficient of correlation, the
following steps should be taken.

1. Find out the mid values of the class intervals of the x and y
series. Assume any point as origin and calculate deviations, and
show them by dx and dy respectively.

2. Multiply the frequencies of the x series by dx in order
to get Σfdx.

3. Multiply fdx by dx in order to get Σfdx².

4. The same procedure as in 2 and 3 should be followed for
calculating Σfdy and Σfdy².

5. For calculating Σxy, starting with the top horizontal
array, multiply the dx and dy of each cell and then multiply the
product by the cell frequency. The resulting figure is to
be put down in the top corner of the cell. The same procedure
should be followed for each cell.

6. Total all the top figures of each row and put the total in the
column of xy. The total of this column will be Σxy.

7. As this is a class frequency table, the formula will be—

    r = [Σxy − N (Σfdx/N)(Σfdy/N)]
        ÷ [N √(Σfdx²/N − (Σfdx/N)²) × √(Σfdy²/N − (Σfdy/N)²)]
(Deviations from 35 in class intervals of 10; the figure in each
cell is f × dx × dy, with the cell frequency f in brackets)

                               Ages of wives
Ages of        M.V.    10–20    20–30    30–40    40–50    50–60
husbands               (15)     (25)     (35)     (45)     (55)
          dx \ dy       −2       −1        0       +1       +2      f     fdx    fdx²    xy
10–20       −2        24 (6)    6 (3)      —        —        —      9     −18     36     30
20–30       −1         6 (3)   16 (16)   0 (10)     —        —     29     −29     29     22
30–40        0          —       0 (10)   0 (15)   0 (7)      —     32       0      0      0
40–50       +1          —        —       0 (7)   10 (10)   8 (4)   21     +21     21     18
50–60       +2          —        —        —       8 (4)   20 (5)    9     +18     36     28
f                       9       29       32       21        9     100     − 8    122     98
fdy                   −18      −29        0      +21      +18     − 8
fdy²                   36       29        0       21       36     122
xy                     30       22        0       18       28      98

r = [Σxy − N (Σfdx/N)(Σfdy/N)]
    ÷ [N √(Σfdx²/N − (Σfdx/N)²) × √(Σfdy²/N − (Σfdy/N)²)]
  = [98 − 100 (−8/100)(−8/100)]
    ÷ [100 √(122/100 − (−8/100)²) × √(122/100 − (−8/100)²)]
  = (98 − .64) ÷ (100 √1.2136 × √1.2136)
  = 97.36/121.36
  = +.802
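The whole of the grouped working can be checked from the two-way frequency table directly. The sketch below uses the mid-values of the classes and a hypothetical helper grouped_r; rows are taken as husbands' age classes and columns as wives' age classes.

```python
import math


def grouped_r(row_mids, col_mids, table):
    """Coefficient of correlation from a two-way frequency table of mid-values."""
    n = sx = sy = sxx = syy = sxy = 0
    for x, row in zip(row_mids, table):
        for y, f in zip(col_mids, row):
            n += f
            sx += f * x
            sy += f * y
            sxx += f * x * x
            syy += f * y * y
            sxy += f * x * y
    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sxx - sx**2) * (n * syy - sy**2))
    return numerator / denominator


mids = [15, 25, 35, 45, 55]
table = [
    [6, 3, 0, 0, 0],     # husbands 10-20
    [3, 16, 10, 0, 0],   # husbands 20-30
    [0, 10, 15, 7, 0],   # husbands 30-40
    [0, 0, 7, 10, 4],    # husbands 40-50
    [0, 0, 0, 4, 5],     # husbands 50-60
]

r = grouped_r(mids, mids, table)
print(round(r, 3))  # 0.802
```

Working with the mid-values themselves, rather than with step deviations, gives the same coefficient, because r is unaffected by a change of origin or of scale.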

Probable Error. Probable error is an amount which, if
added to and subtracted from the coefficient of correlation,
produces a range within which the coefficients of correlation of
other groups selected at random from the same series have an even
chance of falling. According to Wheldon, “Probable error defines
the limits above and below the size of the coefficient determined
within which there is an equal chance that coefficient of
correlation similarly calculated from other samples will fall.”
The Probable error is calculated by using the following formula—

    P.E. = .6745 × (1 − r²)/√N

Where .6745 is a constant number, r = coefficient of
correlation and N = number of items.

The probable error of the above illustration, r of which is .8,
will be

    P.E. = .6745 × (1 − .8²)/√100
         = .6745 × (1 − .64)/10
         = .6745 × .36/10
         = .024

The coefficient of correlation is written as—
    r = +.8 ± .024
The limits of the above coefficient of correlation would be
.8 + .024 = .824 and .8 − .024 = .776. If another sample of 100
husbands and wives is chosen at random from the same universe
from which this sample has been taken, there is an even chance
that the value of its coefficient of correlation will lie between
+.824 and +.776.
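The probable error and the limits it implies can be sketched in a few lines. The function name probable_error is illustrative; the figures r = .8 and N = 100 are those of the illustration, and the six-times rule applied at the end is the rule of thumb stated in the text.

```python
import math


def probable_error(r, n):
    """Probable error of a coefficient of correlation r based on n pairs."""
    return 0.6745 * (1 - r**2) / math.sqrt(n)


r, n = 0.8, 100
pe = probable_error(r, n)
lower, upper = r - pe, r + pe

print(round(pe, 3))                       # 0.024
print(round(lower, 3), round(upper, 3))   # 0.776 0.824
print(r > 6 * pe)                         # True: more than six times the probable error
```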
Probable error is regarded as a measure of the significance of
the correlation coefficient. A few rules for the interpretation of
the significance of correlation based on the use of probable error
are as follows—

(1) If 'r' (coefficient of correlation) is less than the probable
error, there is no evidence of correlation.

(2) If 'r' is more than six times the probable error,
correlation is significant, that is, its existence is a certainty.

(3) If the probable error is relatively small, correlation
should not be considered at all marked when r is less than .3.

(4) If the probable error is small, correlation definitely
exists where 'r' is above .5.

Test of Significance. Another simple method of testing 
whether r differs significantly from 'zero' provided *N' is large is. 
e LR 

VN 
If the value arrived at by this test is greater than the
observed or computed value of correlation coefficient, correlation 
is not significant. If computed value of coefficient of correlation 
is greater than the test value, correlation is said to be significant. 


Spearman's Ranking Method. This method of ascertaining
the coefficient of correlation by ranks was invented by Professor
Charles Spearman. It is based on the ranks of the variables.
Variables are assigned ranks according to their sizes. This
method is to be applied when data are irregular or when extreme
items are either erratic or inaccurate. Moreover, it is applicable
to individual observations rather than frequency distributions.
Under the ranking method original values are not taken into
account. Therefore the result obtained is only approximate.

The formula for the coefficient of correlation according to this
method is:

$$ \rho = 1 - \frac{6 \sum d^2}{N(N^2 - 1)} $$

where ρ (pronounced 'rho'), also written r, stands for the coefficient of correlation,
d stands for the difference between ranks, and
N stands for the number of cases.

Method of Ranking. In this method the biggest item gets 
the first rank, the next biggest second rank and so on. But 
difficulty may be encountered where two or more items are of 
equal value. In such case one of the following methods, 
preferably the second, should be used. 
(1) The Bracket Rank Method. Under this method all items
having equal values are assigned the same rank, and the
next item is assigned the rank which it would have been assigned
in the absence of such a tie, e.g.
Items   30   32   35   35   40   42
Rank     6    5    3    3    2    1
(2) The Average Rank Method. Under this method all items
with ties are assigned the average of the ranks assignable to all
of them. The next item is assigned the usual rank, e.g.
Items   30   32   35   35   40   42
Rank     6    5   3.5  3.5   2    1
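
The two ways of ranking tied items can be sketched in Python as follows. The function names are hypothetical and chosen only for illustration; the largest item receives rank 1, as in the text.

```python
def bracket_ranks(values):
    """Tied items share the rank of the first position they occupy."""
    return [1 + sum(v > x for v in values) for x in values]

def average_ranks(values):
    """Tied items share the mean of the positions they occupy."""
    ranks = []
    for x in values:
        larger = sum(v > x for v in values)   # items ranked ahead of x
        ties = sum(v == x for v in values)    # items equal to x (including x)
        ranks.append(larger + (ties + 1) / 2)
    return ranks

items = [30, 32, 35, 35, 40, 42]
print(bracket_ranks(items))
print(average_ranks(items))
```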
The value of Karl Pearson's coefficient of correlation is
not usually the same as that of the coefficient of correlation
ascertained by the ranking method. Professor Thurstone in his
'Fundamentals of Statistics' has given the following table
showing the values of Pearson's coefficient of correlation
corresponding to various values of the rank correlation
coefficient.


Correlation between Rank and Pearson's coefficient.

Rank Correlation coefficient     Pearson's coefficient
          .00                          .000
          .05                          .052
          .10                          .105
          .15                          .157
          .20                          .210
          .25                          .261
          .30                          .313
          .35                          .364
          .40                          .416
          .45                          .467
          .50                          .518
          .55                          .568
          .60                          .618
          .65                          .668
          .70                          .717
          .75                          .765
          .80                          .813
          .85                          .861
          .90                          .908
          .95                          .954
         1.00                         1.000
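
The tabulated values agree closely with the relation r = 2 sin(πρ/6), which holds for a bivariate normal population. The text does not say how the table was constructed, so the formula in this sketch is an assumption used only to check a few entries; `pearson_from_rank` is a hypothetical name.

```python
import math

def pearson_from_rank(rho):
    """Pearson's r implied by rank correlation rho (assumes bivariate normality)."""
    return 2 * math.sin(math.pi * rho / 6)

for rho in (0.05, 0.25, 0.50, 0.75, 1.00):
    print(rho, round(pearson_from_rank(rho), 3))
```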


Illustration—14
Calculate the coefficient of correlation from the following
data by the method of rank differences.
x: 75, 88, 95, 70, 60, 80, 81, 50
y: 120, 134, 150, 115, 110, 140, 142, 100
(M. Com. Vikram)

  x     Rank     y     Rank    Difference between    d²
                               ranks (d)
  75      5     120      5            0               0
  88      2     134      4           −2               4
  95      1     150      1            0               0
  70      6     115      6            0               0
  60      7     110      7            0               0
  80      4     140      3           +1               1
  81      3     142      2           +1               1
  50      8     100      8            0               0
 N = 8          N = 8                             Σd² = 6

$$ \rho = 1 - \frac{6 \sum d^2}{N(N^2 - 1)} = 1 - \frac{6 \times 6}{8(8^2 - 1)} = 1 - \frac{36}{504} = 1 - 0.07 = +0.93 $$


Illustration—15 
Calculate the rank coefficient of correlation of the following
data:
x: 80, 78, 75, 75, 68, 67, 60, 59
y: 12, 13, 14, 14, 14, 16, 15, 17

  x     Rank     y     Rank    Rank difference d     d²
  80     1      12      8            −7             49
  78     2      13      7            −5             25
  75     3.5    14      5            −1.5            2.25
  75     3.5    14      5            −1.5            2.25
  68     5      14      5             0              0
  67     6      16      2            +4             16
  60     7      15      3            +4             16
  59     8      17      1            +7             49
 N = 8          N = 8                           Σd² = 159.50

$$ \rho = 1 - \frac{6 \sum d^2}{N(N^2 - 1)} = 1 - \frac{6 \times 159.50}{8(8^2 - 1)} = 1 - \frac{957}{504} = 1 - 1.9 = -0.9 $$
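
The two illustrations above can be reproduced with a minimal Python sketch. It assumes the average rank method for ties and applies the formula exactly as given in the text, without any further correction for ties; the function names are illustrative only.

```python
def average_ranks(values):
    """Rank 1 for the largest item; tied items get the mean of their ranks."""
    return [sum(v > x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]

def spearman_rho(xs, ys):
    """Rank correlation: 1 - 6 * sum(d^2) / (N * (N^2 - 1))."""
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Illustration 14
rho_14 = spearman_rho([75, 88, 95, 70, 60, 80, 81, 50],
                      [120, 134, 150, 115, 110, 140, 142, 100])
# Illustration 15 (contains tied values)
rho_15 = spearman_rho([80, 78, 75, 75, 68, 67, 60, 59],
                      [12, 13, 14, 14, 14, 16, 15, 17])
print(round(rho_14, 2), round(rho_15, 2))
```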


Illustration —16 


Ten competitors in a beauty contest are ranked by three
judges in the following order:

1st Judge: 1, 6, 5, 10, 3, 2, 4, 9, 7, 8
2nd Judge: 3, 5, 8, 4, 7, 10, 2, 1, 6, 9
3rd Judge: 6, 4, 9, 8, 1, 2, 3, 10, 5, 7

Use the rank correlation coefficient to discuss which pair of
judges have the nearest approach to common tastes in beauty.
(M. A. Agra)

Rank by   Rank by   Rank by      d        d²       d        d²       d         d²
Judge I   Judge II  Judge III  (I & II)          (I & III)        (II & III)
   1         3         6         −2        4       −5       25       −3         9
   6         5         4         +1        1       +2        4       +1         1
   5         8         9         −3        9       −4       16       −1         1
  10         4         8         +6       36       +2        4       −4        16
   3         7         1         −4       16       +2        4       +6        36
   2        10         2         −8       64        0        0       +8        64
   4         2         3         +2        4       +1        1       −1         1
   9         1        10         +8       64       −1        1       −9        81
   7         6         5         +1        1       +2        4       +1         1
   8         9         7         −1        1       +1        1       +2         4
 N = 10    N = 10    N = 10              200                60                214

$$ \rho(\text{I and II}) = 1 - \frac{6 \sum d^2}{N(N^2 - 1)} = 1 - \frac{6 \times 200}{10(10^2 - 1)} = 1 - \frac{1200}{990} = 1 - 1.2 = -0.2 $$

$$ \rho(\text{I and III}) = 1 - \frac{6 \times 60}{10(10^2 - 1)} = 1 - \frac{360}{990} = +0.64 $$

$$ \rho(\text{II and III}) = 1 - \frac{6 \times 214}{10(10^2 - 1)} = 1 - \frac{1284}{990} = 1 - 1.3 = -0.3 $$

Thus the first and third judges have the nearest approach to
common tastes in beauty.
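
The comparison of the three pairs of judges can be checked with a short Python sketch. The dictionary layout and names below are illustrative assumptions; the ranks are those given in the illustration.

```python
from itertools import combinations

ranks = {
    "I":   [1, 6, 5, 10, 3, 2, 4, 9, 7, 8],
    "II":  [3, 5, 8, 4, 7, 10, 2, 1, 6, 9],
    "III": [6, 4, 9, 8, 1, 2, 3, 10, 5, 7],
}

def rank_correlation(a, b):
    """Spearman's formula applied directly to two lists of ranks."""
    n = len(a)
    sum_d2 = sum((p - q) ** 2 for p, q in zip(a, b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# One coefficient per pair of judges
results = {pair: rank_correlation(ranks[pair[0]], ranks[pair[1]])
           for pair in combinations(ranks, 2)}
closest = max(results, key=results.get)   # pair with the highest coefficient
print(results, closest)
```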


Concurrent Deviation Method. This is a simple, easy-to-calculate
measure of correlation. Under this method deviations
are calculated not from the arithmetic mean but from the
preceding item, and only the direction of each deviation is
considered, not its extent. Deviations are marked '+' in the
case of an increase, '−' in the case of a decrease and '=' when there is
no rise or fall. All the pairs are then carefully observed, and
those having the same sign for both items of the pair,
known as concurrent deviations, are marked '+' in the column of
'Concurrent Deviations'. The formula for calculating the coefficient
of correlation is:

$$ r = \pm \sqrt{\pm \frac{2C - N}{N}} $$

where C stands for the number of concurrent deviations and
N is one less than the number of pairs of items, that is, equal
to the number of deviations.

The use of the + or − sign depends upon the sign of (2C − N)/N.
If it is minus, a minus sign is placed both inside and outside the
radical; if it is plus, all signs are plus.

This method is not regarded as a satisfactory measure of
correlation, because no difference is made between big and
small changes.


Illustration—17 


Following are the marks scored by a group of 11 students in
Commerce and Economics:
Student               A   B   C   D   E   F   G   H   I   J   K
Marks in Commerce    65, 40, 35, 75, 63, 80, 35, 20, 80, 60, 50
Marks in Economics   60, 55, 50, 56, 30, 70, 40, 35, 80, 75, 80
Calculate the coefficient of correlation by the method of
concurrent deviations.

Student   Marks in   Deviation from   Marks in    Deviation from   Concurrent
          Commerce   preceding        Economics   preceding        deviation
             x       student (dx)        y        student (dy)        C
   A         65                          60
   B         40           −              55            −              +
   C         35           −              50            −              +
   D         75           +              56            +              +
   E         63           −              30            −              +
   F         80           +              70            +              +
   G         35           −              40            −              +
   H         20           −              35            −              +
   I         80           +              80            +              +
   J         60           −              75            −              +
   K         50           −              80            +              −
                        N = 10                       N = 10         C = 9

Substituting these values in the formula we get

$$ r = \pm \sqrt{\pm \frac{2C - N}{N}} = +\sqrt{+\frac{2 \times 9 - 10}{10}} = +\sqrt{0.80} = +0.89 $$


Illustration —18 


Calculate the coefficient of correlation by the method of
Concurrent Deviations from the following data:

Week    Average No. employed     Average No. of bales
        daily (in thousands)     consumed daily (in lakhs)
  1            384                      21
  2            385                      24
  3            362                      20
  4            348                      22
  5            384                      26
  6            395                      26
  7            403                      29
  8            400                      28
  9            385                      27
 10            415                      31
 11            418                      32

Week    Number employed   Deviation from    Bales consumed   Deviation from    Concurrent
        (thousands)       preceding week    (lakhs)          preceding week    deviation
I           384                                 21
II          385                +                24                +                +
III         362                −                20                −                +
IV          348                −                22                +                −
V           384                +                26                +                +
VI          395                +                26                =                −
VII         403                +                29                +                +
VIII        400                −                28                −                +
IX          385                −                27                −                +
X           415                +                31                +                +
XI          418                +                32                +                +
                             N = 10                             N = 10           C = 8

$$ r = \pm \sqrt{\pm \frac{2C - N}{N}} = +\sqrt{+\frac{16 - 10}{10}} = +\sqrt{\frac{6}{10}} = +0.77 $$
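
The two illustrations of the concurrent deviation method can be sketched in Python as below. The function name is hypothetical. One assumption goes beyond the text: a pair counts as concurrent only when both deviations have the same sign, so a pair in which one series is unchanged and the other moves is not concurrent, while a pair in which both are unchanged is counted as concurrent.

```python
import math

def sign(v):
    """+1 for a rise, -1 for a fall, 0 for no change."""
    return (v > 0) - (v < 0)

def concurrent_deviation_r(xs, ys):
    dx = [sign(b - a) for a, b in zip(xs, xs[1:])]
    dy = [sign(b - a) for a, b in zip(ys, ys[1:])]
    n = len(dx)                                   # number of deviations
    c = sum(1 for a, b in zip(dx, dy) if a == b)  # concurrent deviations
    ratio = (2 * c - n) / n
    # The sign of the ratio is carried both inside and outside the root.
    return math.copysign(math.sqrt(abs(ratio)), ratio)

commerce  = [65, 40, 35, 75, 63, 80, 35, 20, 80, 60, 50]
economics = [60, 55, 50, 56, 30, 70, 40, 35, 80, 75, 80]
employed  = [384, 385, 362, 348, 384, 395, 403, 400, 385, 415, 418]
bales     = [21, 24, 20, 22, 26, 26, 29, 28, 27, 31, 32]

r_17 = concurrent_deviation_r(commerce, economics)
r_18 = concurrent_deviation_r(employed, bales)
print(round(r_17, 2), round(r_18, 2))
```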


Types of Correlation. Correlation may be :—(a) Simple, 
(b) Multiple, and (c) Partial. 


When there is correlation between two variables, one
independent and the other dependent, it is simple correlation.
Multiple correlation is a measure of the combined effect of two
or more independent variables upon one dependent variable,
e.g. there may be correlation between the yield of a wheat crop and
both rainfall and temperature. Partial correlation arises when
two variables may be correlated partly on account of the fact
that each of them is correlated with a third factor. In such
cases we recognise more than two variables, but find the
correlation between two only, keeping the third constant. For
example, yield and rainfall may be correlated keeping the
influence of temperature constant.


If there are three variables A, B and C, and we wish to find
the partial correlation between A and B quite apart from the
influence of C, the formula will be:

$$ r_{AB.C} = \frac{r_{AB} - (r_{AC}) \times (r_{BC})}{\sqrt{1 - r_{AC}^2} \times \sqrt{1 - r_{BC}^2}} $$

where

r_AB.C = Coefficient of Partial Correlation between A and B keeping
C constant.

r_AB = Simple Coefficient of Correlation between A and B.

r_BC = Simple Coefficient of Correlation between B and C.

r_AC = Simple Coefficient of Correlation between A and C.


Illustration—19 


The coefficient of correlation between Sanskrit and English
is +0.6, that between Sanskrit and Mental Ability is +0.7 and
that between English and Mental Ability is +0.8. Find the
coefficient of correlation between Sanskrit and English keeping
Mental Ability constant.
Let A = Sanskrit
B = English
C = Mental Ability
then r_AB = 0.6
r_AC = 0.7
r_BC = 0.8

$$ r_{AB.C} = \frac{r_{AB} - (r_{AC}) \times (r_{BC})}{\sqrt{1 - r_{AC}^2} \times \sqrt{1 - r_{BC}^2}} = \frac{0.6 - 0.7 \times 0.8}{\sqrt{1 - 0.7^2} \times \sqrt{1 - 0.8^2}} = \frac{0.04}{\sqrt{0.51} \times \sqrt{0.36}} = \frac{0.04}{0.428} = +0.09 $$
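
The partial correlation formula can be checked with a few lines of Python. The function name `partial_correlation` is an illustrative choice; the inputs are the three simple coefficients of the illustration.

```python
import math

def partial_correlation(r_ab, r_ac, r_bc):
    """Correlation between A and B with the influence of C held constant."""
    numerator = r_ab - r_ac * r_bc
    denominator = math.sqrt(1 - r_ac ** 2) * math.sqrt(1 - r_bc ** 2)
    return numerator / denominator

r_ab_c = partial_correlation(0.6, 0.7, 0.8)
print(round(r_ab_c, 2))
```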
Lag and Lead in Correlation. The study of lag and lead is of
special interest in the study of economic and business phenomena.
In the correlation of time series it is sometimes found that a time
interval elapses before a cause-and-effect relationship is
established. The quantity of money in circulation may increase
today, but its effect on the price level may not be instantaneous; it
may take some time for prices to adjust to the quantity of money
in circulation. This time gap between the cause and the effect is
called 'lag'. While calculating the coefficient of correlation, this time
lag must be taken into consideration, otherwise misleading results
will be arrived at. The pairing of items is adjusted according
to the time lag.

If supply affects the price in, say, two months, then in the
following data the supply of each month is paired with the price
two months later (January supply with March price, February
supply with April price, and so on):

Month       Supply    Price
January       80       146
February      82       140
March         86       130
April         91       117
May           83       188
June          85       127
July          89       115
August        96        95
September     93       100
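
The adjustment of pairs for a lag can be sketched in Python as follows. The helper name `pair_with_lag` is hypothetical, and the figures are taken as printed in the table above.

```python
def pair_with_lag(cause, effect, lag):
    """Pair each value of the cause with the effect observed `lag` periods later."""
    return list(zip(cause, effect[lag:]))

supply = [80, 82, 86, 91, 83, 85, 89, 96, 93]
price  = [146, 140, 130, 117, 188, 127, 115, 95, 100]

pairs = pair_with_lag(supply, price, 2)   # two-month lag
print(pairs)   # January supply is paired with March price, and so on
```

With a lag of two months the last two supply figures have no matching price, so nine months of data yield only seven pairs.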


Correlation with the Method of Least Squares. Correlation
between two variables can also be studied with the help of the
line of best fit obtained by the method of least squares.
According to this method, we first find an equation
which gives the best possible values of the y variable (relative)
for given values of x (subject). With the help of this equation,
values of y are calculated for the given values of x. The
mathematical equation describing the relationship of x and y
measures the relationship between them. It is called the 'method of
least squares' because the sum of the squares of all deviations from
this line is the least.

The standard deviation about the line of best fit is
called the Standard Error of Estimate. It is calculated by the
formula

$$ S_y = \sqrt{\frac{\sum d^2}{N}} $$

where S_y = Standard Error of Estimate,
Σd² = total of the squares of the deviations of the actual values of y
from the computed values of y, and
N = number of items.

The line of best fit is based on the equation
y = a + bx
The constants a and b are found from the following two normal
equations, after which values of y are computed for given values of x.

$$ \sum y = Na + b \sum x $$
$$ \sum xy = a \sum x + b \sum x^2 $$

The various steps in the calculation are:

(1) The constants a and b in the equation are calculated
with the help of the 'best fit' equation and the two normal equations.

(2) Probable values of y are then calculated for the given
values of x by applying the equation.

(3) Deviations are found between the actual and computed
values of y.

(4) The deviations are squared and totalled.

(5) The mean of this quantity, Σd²/N, is found. This is known
as the variance about the regression line, S_y².

(6) The variance of the actual values of y is calculated:

$$ \sigma_y^2 = \frac{\sum d_y^2}{N} $$

(7) The coefficient of correlation is calculated by the following
formula.

$$ r = \sqrt{1 - \frac{S_y^2}{\sigma_y^2}} $$
Illustration—20 


Calculate the coefficient of correlation of the following two
series by the least squares method:

 x     y
 1    80
 2    90
 3    92
 4    83
 5    94
 6    99
 7    92

  x      x²      y      xy     Computed values of y
  1       1     80      80           84
  2       4     90     180           86
  3       9     92     276           88
  4      16     83     332           90
  5      25     94     470           92
  6      36     99     594           94
  7      49     92     644           96
Σx = 28  140    630    2576          630

y = a + bx

The two normal equations are

$$ \sum y = Na + b \sum x $$
$$ \sum xy = a \sum x + b \sum x^2 $$

By substituting values we get

 630 = 7a + 28b       (1)
2576 = 28a + 140b     (2)
2520 = 28a + 112b     (3)  Multiply (1) by 4
  56 = 28b                 Deduct (3) from (2)
   2 = b
By substituting the value of b in equation (1)
630 = 7a + 56
630 − 56 = 7a
574 = 7a
82 = a
Substituting the values of a and b in
y = a + bx
we get
y = 82 + 2x
Now values of y will be found for different values of x:
When x = 1, y = 82 + 2 × 1 = 84
When x = 2, y = 82 + 2 × 2 = 86
When x = 3, y = 82 + 2 × 3 = 88
When x = 4, y = 82 + 2 × 4 = 90
When x = 5, y = 82 + 2 × 5 = 92
When x = 6, y = 82 + 2 × 6 = 94
When x = 7, y = 82 + 2 × 7 = 96

  x     y    Computed   Difference between    d²    Deviation of y     dy²
                y       actual and computed         from its average
                        values of y (d)                  (dy)
  1    80      84            −4               16        −10            100
  2    90      86            +4               16          0              0
  3    92      88            +4               16         +2              4
  4    83      90            −7               49         −7             49
  5    94      92            +2                4         +4             16
  6    99      94            +5               25         +9             81
  7    92      96            −4               16         +2              4
       630     630                           142                       254

Average of y = 630/7 = 90

Standard Error of Estimate

$$ S_y = \sqrt{\frac{\sum d^2}{N}} = \sqrt{\frac{142}{7}} = \sqrt{20.3} = 4.5 $$

Standard Deviation of y Series

$$ \sigma_y = \sqrt{\frac{\sum d_y^2}{N}} = \sqrt{\frac{254}{7}} = \sqrt{36.3} = 6.02 $$

Coefficient of Correlation

$$ r = \sqrt{1 - \frac{S_y^2}{\sigma_y^2}} = \sqrt{1 - \frac{20.3}{36.3}} = \sqrt{1 - 0.56} = \sqrt{0.44} = +0.66 $$

The coefficient of correlation calculated by Karl Pearson's
formula and by the least squares method will always be the same.
In the above example Σxy (the sum of the products of the deviations
of x and y from their means) will be 56 and σ_x will be 2.

$$ r = \frac{\sum xy}{N \sigma_x \sigma_y} = \frac{56}{7 \times 2 \times 6.02} = \frac{56}{84.28} = +0.66 $$

$$ \text{and } S_y = \sigma_y \sqrt{1 - r^2} = 6.02 \sqrt{1 - 0.66^2} = 6.02 \sqrt{1 - 0.44} = 6.02 \sqrt{0.56} = 6.02 \times 0.75 = 4.5 $$
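
The steps of the least squares method can be followed in a short Python sketch using the data of this illustration. The variable names are illustrative; the closed-form expressions for a and b are simply the solution of the two normal equations. Taking the sign of r from the sign of b is an assumption added here, since the square root alone gives only the magnitude.

```python
import math

x = [1, 2, 3, 4, 5, 6, 7]
y = [80, 90, 92, 83, 94, 99, 92]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)

# Solve the two normal equations for b and a.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

computed = [a + b * xi for xi in x]                       # values on the line
sum_d2 = sum((yi - ci) ** 2 for yi, ci in zip(y, computed))
mean_y = sum_y / n
sum_dy2 = sum((yi - mean_y) ** 2 for yi in y)

s_y = math.sqrt(sum_d2 / n)        # standard error of estimate
sigma_y = math.sqrt(sum_dy2 / n)   # standard deviation of y
r = math.copysign(math.sqrt(1 - s_y ** 2 / sigma_y ** 2), b)
print(a, b, round(s_y, 1), round(sigma_y, 2), round(r, 2))
```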
Short-cut Method. The coefficient of correlation by the Least
Squares method can also be calculated by a short-cut method.
Under the short-cut method S_y and σ_y are not required to be
calculated. Besides the value of Σy, the value of Σy²
has also to be obtained. The formula under the short-cut method is:

$$ r = \sqrt{\frac{a \sum y + b \sum xy - N c_y^2}{\sum y^2 - N c_y^2}} $$

where c_y is the difference between the mean of y and the
origin employed in calculations. Let us calculate r by this
method in the above example:

 x     y      y²
 1    80    6400
 2    90    8100
 3    92    8464
 4    83    6889
 5    94    8836
 6    99    9801
 7    92    8464
      630   56954

$$ r = \sqrt{\frac{(82 \times 630) + (2 \times 2576) - 7(90)^2}{56954 - 7(90)^2}} = \sqrt{\frac{112}{254}} = \sqrt{0.44} = +0.66 $$
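
The short-cut calculation can be verified with the following Python sketch. It assumes the origin is zero, so that c_y is simply the mean of y, and it takes a = 82 and b = 2 from the worked example above.

```python
import math

x = [1, 2, 3, 4, 5, 6, 7]
y = [80, 90, 92, 83, 94, 99, 92]
a, b = 82, 2                      # constants of the line of best fit
n = len(y)
c_y = sum(y) / n                  # mean of y, measured from origin zero

sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_y2 = sum(yi ** 2 for yi in y)

numerator = a * sum(y) + b * sum_xy - n * c_y ** 2
denominator = sum_y2 - n * c_y ** 2
r = math.sqrt(numerator / denominator)
print(numerator, denominator, round(r, 2))
```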


Coefficient of Determination. The total variation in the y
variable may be due to two factors: (1) variations in the x
variable and (2) variations unaccounted for by the x variable.
That proportion or percentage of the variation in y which may
be accounted for by differences in x is known as the coefficient of
determination. It is a very useful tool in correlation analysis.
The coefficient of determination is

$$ r^2 = 1 - \frac{S_y^2}{\sigma_y^2} $$

In the above example the coefficient of determination will be

$$ r^2 = 0.66^2 = 0.44 $$

$$ \text{or } 1 - \frac{S_y^2}{\sigma_y^2} = 1 - \frac{20.3}{36.3} = 1 - 0.56 = 0.44 $$

The proportion of variation not accounted for is S_y²/σ_y²,
or (1 − r²) = K², and is called the coefficient of non-determination.
r² + K² = 1 = total variation.
√K² (that is, K) is called the coefficient of alienation. In the above example
K² = 1 − 0.44 = 0.56
Coefficient of alienation = √K² = √0.56 = 0.75
K measures the extent of departure from perfect correlation.


Theoretical Questions 


1. What is meant by correlation? Give the general rules
for interpreting its coefficient. (M. Com. Alld.)
2. Explain the meaning and significance of the concept of
correlation. How will you calculate it from a statistical point
of view? (M. Com. Agra)
3. Define correlation and distinguish between positive and
negative correlation. (Agra. M.A.)
4. Discuss the problems involved in correlation analysis in
the case of time series and state how they can be solved.
(M.A. Alld.)
5. Write notes on:
(1) Scatter Diagram, (2) Lag and Lead, (3) Positive and
Negative Correlation, (4) Probable Error.

Practical Questions 


1. The following table shows the marks obtained by ten 
students in Accountancy and Statistics :— 


Student No 7 8 9 | 10 
Kecountancy | E 70 | 65 бый 90| 40| 50 75 85 60 
Statistics | [a5 901 70 | 40 | 95 “95| 40 40! 60 80 80 50 


Find the Coefficient of Correlation. (B. Com. B.H.U.)
(Ans. r = +0.903)


2. Find the coefficient of correlation :— 


Roll No n ai Sco A RC 4 |5 6| 7 8 9 |10 
Statistics 80 | 60 | 51 z 58| 62| 64 | 72 | 56 | 58 
62| 68! 48 | go | 62 | co 


Law 45 | 71 60 57 


(Ans. r——0'676) (B. Com. Raj.) 
3. Calculate the correlation coefficient between x and y given
by the following data:
x: 25 29 88 48 45 49 55 59
y: 52 57 69 76 79 84 92 96
(Ans. r = +0.9993) (B. Com. B.H.U.)


4. Calculate Karl Pearson's coefficient of correlation between 
the values of x and y given below :— 


x 42 44 58 55 89 98 66 
y 56 49 53 58 65 76 58 
(Ans. r—--0'904) (B. Com. Agra) 


5. Find the correlation between the income and expenditure 
of a wage-earner on piece rate system working in a factory :— 


Months Income in Rs Expenditure in Rs 
October 46 36 
November 54 40 
December 56 44 
January 56 54 
February 58 42 
March 60 58 
April 62 54 
May 66 58 


(Ans. r—-L0'82) (B. Com. B.H.U.) 


6. The following figures give the capital employed by a firm 
in ten successive years, together with the Profit made in each year, 


both in thousands of rupees :— 


UL DNI acc] A 
Capital | 10 | 20 | 80 | 40 | 50 во | 70 | 80 | 90 р 
Profit 2 м | 8 z | 10 | 15 | 14 | 201 221 80 

fo we Е АЕ 


Find the coefficient of correlation and state your deductions. 
(Ans. r=-+0'96) (B. Com. B.H.U.) 


7. Calculate Karl Pearson’s Coefficient of correlation 
between the ages of husbands and wives and comment on the 
result :— 


Ages of husbands 20 30 40 50 60 70 80 
Ages of wives 14 25 80 32 40 45 65 

(Ans. r—--0'96) (B. Com. B.H.U.) 

8. What is meant by a Correlation Coefficient? The
following table gives the income and expenditure on food of 10
working class families:

Income of family     Expenditure on food
in Rupees            in Rupees
   20                   10.2
   25                   12.8
   35                   15.9
   45                   19.6
   55                   22.6
   65                   26.8
   75                   29.4
   85                   32.0
  100                   42.5
  105                   48.0
Calculate the Correlation Coefficient between income and
expenditure on food. (B. Com. B.H.U.)
(Ans. r = +0.994)


9—Caleulate Pearson's Coefficient of Correlation between 
wages and cost of living from the following data :— 


Index Numbers 


Wages — 100 101 103 102 100 99 97 98 96 95 
Cost of living. . 98 99:99 97 95 92 95 94 90 91 


B. Com. Agra 
(Ans. r=-+0.85) Ci ue deno 


10—Caleulate the Coefficient of correlation from the following 
table and interpret it :— 


Year Average daily No. Number of bales 
of labourers consumed by mills 
(000) (00,000) 
1925 868 22 
1926 ; 384 21 
1927 : ‚ 885 24 
1928 861 20 
1929 847 22 
1980 884 26 
1981 $ 395 26 
1932 403 29 
1933 400 28 
1934 385 27 
: (B. Com. Agra) 


(Ans. r=-+0.79) 


11—Find out if there is any relationship between, the number 
of co-operative societies and indebtedness per family. 
No. of societies Indebtedness per family
43 163 
0 165 
21 880 
18 481 
12 440 
12 487 
31 594% 
16 710 
30 794 
86 850 
(Ans. т=-0.41) (В. Сот. Арта) 


12. Calculate the coefficient of correlation for the following
ages of husbands and wives :— 


Husband's age Wife's age 
28 18 
27 22 
28 28 
29 24 
30 25 
31 26 
33 28 
35 + 29 d 
36 30 
39 é 32 
(Ans. r=+1) (М. A. Agra) 


13. Ten students got the following percentage of marks in
Principles of Economics and Statistics:

Student               1   2   3   4   5   6   7   8   9  10
Marks in Economics   78  36  98  25  75  82  90  62  65  39
Marks in Statistics  84  51  91  60  68  62  86  58  53  47
Find the coefficient of correlation.
(Ans. r = +0.78) (M. A. Agra)


14—Caleulate the coefficient of correlation between the 
income and the general level of prices from the following data :— 


Year Income General I. No. 
in Rs. ` of Prices 
1949 360 100 
1950 420 104 
1951 500 115 ° 
1952 556 160 
1958 600 280 
1954 640 290 
1955 680 300 
1956 720 820 
1957 750 330 


(Ans. r—--0.96) (M. A. Agra) 
15. Calculate Karl Pearson's coefficient of correlation from
the following data:


Series A Series B 
112 200 
114 190 
108 214 
124 187 
145 170 
150 170 
119 210 
125 190 
147 180 
150 180 

(Ans. r——0.9) (В. Com. Alld.) 


16. Compute the coefficient of correlation of the short-time
oscillations from the following data:


Year Supply Price 
1951 » 80 146 
1952 82 140 
1953 86 130 
1954 91 : 117 
1955 83 188 
1956 Р 85 127 
1957 89 115 
1958 96 95 
1959 93 100 


(Assume a three year cycle and ignore decimals) 
(Ans.r——0'99) ` (M. A. Allahabad) 


17. Find if there is any significant correlation between 


heights and weights given below. 


Height in inches Weight in lbs. 
57 118 
59 117 
62 126 
63 126 
64 130 
65 129 
55 " 111 
58 116 
57 112 


(Ans. 1=--0"98) (B. Com. Alig.) 
18. During 1910-50 in Uttar Pradesh the five-yearly averages
of the percentage area (i) net area sown and (ii) wasteland
including current fallow were as follows:

 (i)     (ii)      (i)     (ii)
51.9     19.9     52.6     19.2
51.7     20.0     53.2     18.6
51.7     19.8     54.2     18.1
51.1     20.5     55.9     18.6
Calculate the coefficient of correlation. (M. A. Alig.)
(Ans. r = −0.85)


19. The following table gives the frequency according to age 
groups of marks obtained by 67 students in an intelligence test :— 


Age in years 

Test Marks 18 19 20 21 Total 
200—250 4 4 2 1 11 
250—300 8 5 4 2 14 
300—350 2 6 8 5 21 
350—400 1 4 6 10 21 
Total 10 19 20 18 67 
Is there any relationship between age and intelligence ? 

(Ans. r=-++0°41) (B. Com. Agra) 


20. The following table gives the number of students having 
the different heights and weights. 


Өй, [з= Se 
Weight in Pounds 
Height in | — — m5 


inches | 80—90 | 90—100 100—110 110—120 |120—130 Total 


—— 


| КА 


50—55 2 6 12 10 5 85 
55—60 5 7 20 13 8 52 
60—65 2 11 25 20 13 71 
65—70 0 6 17 14 5 42 
Total 8 зо | 74 57 31 200 
e: par ILU NEN CRT 


Do you find any relation between height and weight ? 
З (В. Com. Agra) 
(Ans. r=-+0"06) 


21. The correlation table given below shows the ages of 
husband and wife for 58 married couples living together on the 
census night of 1941. Caleulate the co-efficient of correlation 
between the age of husband and thati of his wife. 
Age of Age of wife Total
Husband 15—25 25—35 35—45 45—55 55—65 65—75 | 
15—25 1 1 20 
95—35 ° 12 1 15 
35—45 4 10 Èi / 15 
45—55 8 6 1 10 
55—65 2 4 2 8 
65—75 1 2 3 
Total 3 17 14 9 6 4 58 


(I. A.S) 
(Ans. r—--0'908) 


22. Explain correlation and calculate coefficient of correla- 
tion between ages of husbands and ages of wives in the following :— 


Р Ages of wives 
offe E ME UM c uU м 
na 10—20] 20—30| 30—40 | 40—50 | 50—60 | Total 
15—95 6 3 E x = 9 
25—35 3 16 10 ase EUM og 
35—45 grs TE 18 7 EE. ва 
45—55 EN = 7 10 4 21 
55—65 EN Ee enc = 4 5 9 
Total 9 29 32 | э1 о | 100 


(М. А. & M. Com. Agra) 
(Ans. r=-+.802) 


23—Compute r between age in years and marks obtained 
from the following tabulated data. 


Age in years X 
п 4 i T e QOL HEAD UP PEST, Total 
Y 16—18 | 18—90 | 20—22 | 22—24 

10—20 2 1 1 4 
20—30 8 2 8 2 10 
30—40 8 4 5 6 18 
40—50 2 E 3 4 11 
50—60 Ча 2 2 5 
60—70 1 2 T 4 

'Total 10 11 16 15 52 


(Ans. r—4-0.28) 
24. Calculate the coefficient of correlation between the ages
of husbands and wives from the following data. Also estimate
the probable error of the coefficient.


Ages of husbands 

Ages of | | Total 
wives  |20—80 | 30—40 | 40—50 | 50—60 | 60—70 

15—25 5 9 | 8 17 
25—35 10 25 2 37 
35—45 1 12 2 15 
45—55 | 4 16 5 25 
55—65 | 4 2 6 
Total Б 20 | 44 24 7 100 


(Ans. 1r=-+0.7962) (P.E—.0247) 
25— Calculate the coefficient of correlation between the X and 


Y values given in the following table. 
MILCH ито ве сс 


Values of X 
Values of ` Totals 
y 10— | 15— | 20— | 25— | 30— | 35— 
И CATS 
О 1 3 1 | 5 
18— 2 5 7 6 | 20 
Zi P. 5 8 12 3 30 
24— 4 4 8 7 2 25 
27— 6 6 8 15 
8025 n 1 5 
Totals |. 5 17 | 90 32 20 6 100 


(Ans. r—--0.623) 
26— Find the coefficient of correlation between marks obtained 
by 60 candidates in Economics and Statistics from the following 


data. 

data. .—. jo MU eua quc m 

Marks in Marks in Statistics X 

Economies| | Total 

X 5—15 | 15—25 | 25—85 35—45 

EAE _ Жы зщ 
0—10 1 1 2 
10—20 3 6 5 1 15 
20—30 1 8 9 2 20 
30—40 3 9 8 15 
40—50 4 4 8 

sae | ЗАР узи а a эрг т. 
Total 5 18 27 | 10 60 


EE 25 ci есе 
Also caleulate the probable error of r. 
(Ans. x—--0.583) (P.E.—.066) 
27. The following table presents 100 couples classified
according to the age of the pairs at the time of marriage. Is 
there any correlation in their ages ? 


Husband's Age 
20 —— 25 ——. 80 ——. 88 —— 40 — 4 
xv. Aug 10 2 — — 
кыкыо кшш шна ке LM iita. 
boj 4 28 6 2 — 
nd 
L à = 5 11 4 2 
Ei} — — 1 2 1 
R 
а — — — 1 1 
What light do these figures throw on the marriage custom 
of the people ? (M. Com. Agra) 


(Ans. r—--.91) 

28—The following table gives class frequency distribution 
of 45 clerks in a business office according to age and pay. Find 
the correlation if any between age and pay :— 


ay 
60 70— —80———90———100 110 
f 20—30 4 3 1 — — 
| 30—40 2 5 2 1 — 
Авеј 40—50 1 2 8 2 1 
| 50—60 | — 1 3 5 2 
l 60—70 | — — 1 1 5 
(Ans. r=-+0.749) (M. Com. Agra) 


29—Find the correlation, if any, between height and weight 
from the following class frequency distribution :— 
Hight in inches 


59—— — ——61— —— — 68— ———— —65—— 67—09 
я 12 8 6 2 0 
= 
ES 
on 6 10 4 8 1 
e 
Н 5 р ie n. 
| 8 16 10 2 
58 
Ф 1 | 
ES 2 8 5 8 8 
La] 
g 
= 1 4 | 5 6 14 


(М. Com. Арта) 
(Ans. r=-+0.54) 
30. Calculate the coefficient of correlation between x and y
series given below by the method of Rank Differences. 


x: 78 ВОЗОТ ОСОРО оэ 57 
у: 125 187 156 112 107 186 128 108 
(Ans. r—--0.95) 
31. Following are the marks obtained by 10 students in 
two tests. 


Student AUS chp вт ант) 


|__| | ———— 


Test І 70 | 68 67 55 60 | 60 | 75 | 68 | 60 |72 


Test 11| 65 | 65 | 80 60 | 68 | 58 | 75 62 | 60 |70 


Caleulate the rank coefficient between the two tests. 
(Ans. r—--0.751) 


32. Find the coefficient of correlation from the following


хш 5 10 15 20 25 30 Total 
10 : 1 1 2 8 12 24 
15 1 2 5 9 80 11 108 
20 2 15 42 98 36 8 201 
25 5 20 51 37 10 2 125 
80 8 16 8 5 4 1 42 
Total 16 54 107 151 188 БУЛ 500 


(M.A. Calcutta). 
(Ans. r=— 0.58) 


33. The following table shows the distribution of marks.
Calculate the coefficient of correlation and its probable error:
Marks їп Geography 
0—20 20—40 40—60 60—80 Total 


.$ Range of Marks 
3 
© 
© 0—20 32 88 15 R 135 
E 20—40 45 436 200 4 685 
2 40—60 16 500 398 25 989 
25 60—80 w 105 532 40 677 
= 80—100 К 8 40 16 64 
8 и 
= Total 93 1,187 1185 85 2,500 


EU! (M.A., Calcutta). 
(Ans. r=-+.4889 + .01) 
34. Find the correlation coefficient between heights of fathers
and sons from the following data :— 


Height of father in inches 


65 
66 
67 
67 
68 
69 
71 
73 


(Ans. r—-4-0.47) 


Height of son in inches 


67 
68 
64 
68 
72 
70 
69 


70 
(M.A. Alld.) 


35. Calculate product-moment coefficient of correlation
between x and y from the following data :— 


x y 
46 88 
50 71 
33 62 
40 60 
27 40 
44 57 
60 61 
38 50 
42 53 
25 28 
(Ans. r=-}.62) 


(Gujrat. B. Com.) 


36. Calculate the coefficient of correlation between the values 
of x and y by Karl Pearson’s method. 


(Marks out of 100) 


х 


(Ans. т=-Ь 8%) 


y- 


(Marks out of 100) 


(Gujrat. B. Com.) 
37. Ten students got the following percentage of marks in
Principles of Economics and Statistics:

Student     Percentage of marks     Percentage of marks
            in Economics            in Statistics
   1              78                      84
   2              36                      51
   3              98                      91
   4              25                      60
   5              75                      68
   6              82                      62
   7              90                      86
   8              62                      58
   9              65                      53
  10              39                      47

Calculate rank correlation coefficient.
(M.A. Agra)
(Ans. r = +.82)


38. The rankings of ten students in two subjects A and B 
are as follows : 
Subject A— B BRT ER АУТО EP ЕО 9 
Subject B— BY NB ОВ а "Br, 


What is the coefficient of rank correlation ? 
(M.A. Delhi): 


(Ans. r=—.3) 


39. Psychological tests of intelligence and of arithmetical
ability were applied to 10 children. Here is a record of ungrouped
data showing intelligence ratio (I.R.) and arithmetic ratio (A.R.).
Calculate r.


P m 
СыГА BY, C/ D| E 
Т. В. | 105 


А. R.| 101 


100 


95 | 96 


104 | 102 | 101 


—— 


103 | 100 | 98 


104 | 92 


(М. 5с. Арта) 
40. The following table gives the frequency according to age
groups of marks obtained by 75 students in an intelligence test :— 


Age in years 

езү, 5 T Total 
Marks 19 20 21 22 23 

0—20 4 4 2 1 1 12 
20—40 3 5 4 2 2 16 

Aa CER Ud dM хал УКТ АА ЫИ 

40—60 3 6 8 5 3 25 
60—80 cx 4 6 8 4 22 
Total 10 19 20 16 10 75 


Calculate the coefficient of correlation between age and 
intelligence. 


(M. A. Raj.) 
(Ans. r—--.31) 


41. Find the correlation coefficient between age and playing habit
of the following students.

Age                 15    16    17    18    19    20
No. of students    250   200   150   120   100    80
Regular players    200   150    90    48    30    12

(M. Com. Agra ; B. Com. Raj.)
Hint: Find the percentage of the regular players.
(Ans. r = −.9913)


REGRESSION AND 
RATIO OF VARIATION 


"Correlation and regression techniques have been applied to
more appropriate problems and in fact they usually are. While
correlation analysis tests the closeness with which two or more
phenomena co-vary, regression analysis measures the nature and
extent of this relation, thus enabling us to make predictions."

Werner Z. Hirsch, "Introduction to Modern Statistics"


Regression means to regress or to return back. The technique
of regression was developed by Sir Francis Galton towards the
end of the nineteenth century while studying the relationship between
the heights of fathers and sons. He introduced this word in his
paper "Regression towards Mediocrity in Hereditary Stature".
His investigation of the heights of about one thousand fathers
and sons revealed a rather interesting relationship. Tall fathers
tend to have tall sons and short fathers short ones, but the
average height of the sons of a group of tall fathers is less than
that of the fathers, and the average height of the sons of a group
of short fathers is greater than that of the fathers. In other words,
the sons tend to regress towards the average male height. Galton
began to describe this phenomenon as one of regression, a
term that soon was used to designate the relationship between
two or more variables.


Regression studies the average relationship between two variables.
With its help we can know the average probable change in one
series for a given amount of change in the other. If the
regression coefficient of the ages of wives on the ages of husbands
is +.8, it means that for each change of one year above or below
the average age of husbands, there is on the average a change of
.8 years from the average in the age of wives in the same direction.
(The coefficient of correlation gives this figure only when the
standard deviations of the two series are equal.)


There are two regression equations, which give the best estimate
of one variable when the other is exactly known or given. These
equations are called x on y and y on x. From the equation x on
y, we can know the most probable value of x for a given value of y.
From the second equation, y on x, we can know the most probable
value of y for a given value of x. Corresponding to these two
regression equations, two lines can be drawn on graph paper
and the extent of the relationship can be studied. These are called
'lines of regression' or lines of best fit. If there is perfect
correlation, both the lines coincide. The lines cut each
other at the point of the averages of x and y. The nearer the lines
are to each other, the greater is the extent of correlation. These
regression equations are :


$$x = a + by \quad (x \text{ on } y)$$
$$y = a + bx \quad (y \text{ on } x)$$

In these equations x and y are the values of the two variables and
a and b are constants. The values of a and b can be calculated
with the help of normal equations. These equations for
$y = a + bx$ will be

$$\sum y = Na + b \sum x$$
$$\sum xy = a \sum x + b \sum x^2$$

and for $x = a + by$ will be

$$\sum x = Na + b \sum y$$
$$\sum xy = a \sum y + b \sum y^2$$


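The coefficients of regression are found by the formulae :

regression coefficient of x on y, or $b_{xy} = r \dfrac{\sigma_x}{\sigma_y}$

regression coefficient of y on x, or $b_{yx} = r \dfrac{\sigma_y}{\sigma_x}$

Regression equations can be written with the help of the
regression coefficients as :

x on y :
$$(x - \bar{x}) = r \frac{\sigma_x}{\sigma_y} (y - \bar{y})$$

and y on x :
$$(y - \bar{y}) = r \frac{\sigma_y}{\sigma_x} (x - \bar{x})$$
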
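where $\bar{x}$ and $\bar{y}$ are the averages of the x and y series respectively.

Illustration 1

Calculate Karl Pearson's coefficient of correlation and the
regression equations from the following data :

Age of husband   Age of wife  |  Age of husband   Age of wife
      18             17       |        23             19
      19             17       |        24             19
      20             18       |        25             20
      21             18       |        26             21
      22             18       |        27             21

Taking x as the age of husband and y as the age of wife, with
deviations measured from the assumed averages 22 and 18 :

  x     dx (22)    dx²      y     dy (18)    dy²     dx.dy
 18       -4        16     17       -1         1        4
 19       -3         9     17       -1         1        3
 20       -2         4     18        0         0        0
 21       -1         1     18        0         0        0
 22        0         0     18        0         0        0
 23       +1         1     19       +1         1        1
 24       +2         4     19       +1         1        2
 25       +3         9     20       +2         4        6
 26       +4        16     21       +3         9       12
 27       +5        25     21       +3         9       15
N = 10    +5        85              +8        26       43

$$\bar{x} = 22 + \frac{5}{10} = 22 + .5 = 22.5$$
$$\bar{y} = 18 + \frac{8}{10} = 18 + .8 = 18.8$$

$$\sigma_x = \sqrt{\frac{\sum dx^2}{N} - \left(\frac{\sum dx}{N}\right)^2} = \sqrt{\frac{85}{10} - \left(\frac{5}{10}\right)^2} = \sqrt{8.5 - .25} = \sqrt{8.25} = 2.87$$

$$\sigma_y = \sqrt{\frac{\sum dy^2}{N} - \left(\frac{\sum dy}{N}\right)^2} = \sqrt{\frac{26}{10} - \left(\frac{8}{10}\right)^2} = \sqrt{2.6 - .64} = \sqrt{1.96} = 1.4$$

$$r = \frac{\sum dx\,dy - N(\bar{x} - 22)(\bar{y} - 18)}{N \sigma_x \sigma_y} = \frac{43 - 10(.5)(.8)}{10 \times 2.87 \times 1.4} = \frac{39}{40.18} = +.97$$

Regression coefficients :

x on y, or $b_{xy} = r \dfrac{\sigma_x}{\sigma_y} = .97 \times \dfrac{2.87}{1.4} = 1.98$

y on x, or $b_{yx} = r \dfrac{\sigma_y}{\sigma_x} = .97 \times \dfrac{1.4}{2.87} = .47$

Regression equations :

x on y :
$(x - \bar{x}) = r \dfrac{\sigma_x}{\sigma_y} (y - \bar{y})$
or $(x - 22.5) = .97 \times \dfrac{2.87}{1.4} (y - 18.8)$
or $(x - 22.5) = 1.98 (y - 18.8)$
or $(x - 22.5) = 1.98y - 37.22$
or $x = 1.98y - 37.22 + 22.5$
$x = 1.98y - 14.72$

y on x :
$(y - \bar{y}) = r \dfrac{\sigma_y}{\sigma_x} (x - \bar{x})$
or $(y - 18.8) = .97 \times \dfrac{1.4}{2.87} (x - 22.5)$
or $(y - 18.8) = .47 (x - 22.5)$
or $(y - 18.8) = .47x - 10.575$
or $y = .47x - 10.575 + 18.8$
or $y = .47x + 8.225$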
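The coefficient of correlation can also be computed with the help
of the regression coefficients :

$$b_{xy} = r \frac{\sigma_x}{\sigma_y} \quad \text{and} \quad b_{yx} = r \frac{\sigma_y}{\sigma_x}$$

Hence
$$b_{xy} \times b_{yx} = r \frac{\sigma_x}{\sigma_y} \times r \frac{\sigma_y}{\sigma_x} = r^2$$
$$\sqrt{b_{xy} \times b_{yx}} = r$$

The two regression coefficients always have the same sign, and r
takes that sign.

In the above example
$$r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{1.98 \times .47} = \sqrt{.9306} = +.96$$
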

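Alternative Methods of Calculating Regression Coefficients.
Based on the different labour and time saving devices for finding
the standard deviation and the coefficient of correlation, regression
coefficients can be found by other methods also.

(1) (When the actual average is used ; x and y are deviations from the actual means)

$$b_{xy} = \frac{\sum xy}{N \sigma_y^2} \qquad \left(\because\ r \frac{\sigma_x}{\sigma_y} = \frac{\sum xy}{N \sigma_x \sigma_y} \times \frac{\sigma_x}{\sigma_y}\right)$$

$$b_{yx} = \frac{\sum xy}{N \sigma_x^2} \qquad \left(\because\ r \frac{\sigma_y}{\sigma_x} = \frac{\sum xy}{N \sigma_x \sigma_y} \times \frac{\sigma_y}{\sigma_x}\right)$$

(2) (When an assumed average is used ; x and y are deviations from the assumed averages)

$$b_{xy} = \frac{N \sum xy - \sum x \cdot \sum y}{N \sum y^2 - (\sum y)^2}$$

$$b_{yx} = \frac{N \sum xy - \sum x \cdot \sum y}{N \sum x^2 - (\sum x)^2}$$

In the above example

$$b_{xy} = \frac{43 \times 10 - 5 \times 8}{10 \times 26 - (8)^2} = \frac{390}{196} = 1.99$$

$$b_{yx} = \frac{43 \times 10 - 5 \times 8}{10 \times 85 - (5)^2} = \frac{390}{825} = .47$$

(3) (In original values)

$$b_{xy} = \frac{\sum xy - N \bar{x} \bar{y}}{\sum y^2 - N \bar{y}^2} \qquad b_{yx} = \frac{\sum xy - N \bar{x} \bar{y}}{\sum x^2 - N \bar{x}^2}$$

where $\bar{x}$ and $\bar{y}$ stand for the means of the x and y series
respectively.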
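
The assumed-average method (2) lends itself to a short check by machine. The following Python sketch is only an illustration: the function name `regression_coefficients` and its arguments are hypothetical, and the data and assumed averages (22 and 18) are those of Illustration 1.

```python
# Regression coefficients by the assumed-average shortcut, method (2).
# Names are illustrative only; data are taken from Illustration 1.

def regression_coefficients(xs, ys, assumed_x, assumed_y):
    """Return (b_xy, b_yx) from deviations taken about assumed averages."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    numerator = n * sum_dxdy - sum_dx * sum_dy
    b_xy = numerator / (n * sum_dy2 - sum_dy ** 2)  # x on y
    b_yx = numerator / (n * sum_dx2 - sum_dx ** 2)  # y on x
    return b_xy, b_yx

husband = [18, 19, 20, 21, 22, 23, 24, 25, 26, 27]
wife = [17, 17, 18, 18, 18, 19, 19, 20, 21, 21]

b_xy, b_yx = regression_coefficients(husband, wife, 22, 18)
# Positive root, because both coefficients are positive here.
r = (b_xy * b_yx) ** 0.5
print(round(b_xy, 2), round(b_yx, 2), round(r, 2))
```

Because the coefficients are computed without intermediate rounding, the value of $b_{xy}$ may differ in the second decimal place from a figure obtained through a rounded r.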


Illustration 2

Obtain the lines of regression and show them on a graph for
the following data :

  x     dx (5)    dx²      y     dy (12)    dy²     dx.dy
  1      -4        16      9       -3         9       12
  2      -3         9      8       -4        16       12
  3      -2         4     10       -2         4        4
  4      -1         1     12        0         0        0
  5       0         0     11       -1         1        0
  6      +1         1     13       +1         1        1
  7      +2         4     14       +2         4        4
  8      +3         9     16       +4        16       12
  9      +4        16     15       +3         9       12
 45                60    108                 60       57

$$\bar{x} = \frac{45}{9} = 5 \ ; \qquad \bar{y} = \frac{108}{9} = 12$$

$$\sigma_x = \sqrt{\frac{60}{9}} = \sqrt{6.66} = 2.6$$

$$\sigma_y = \sqrt{\frac{60}{9}} = \sqrt{6.66} = 2.6$$

$$r = \frac{\sum dx\,dy}{N \sigma_x \sigma_y} = \frac{57}{9 \times \sqrt{6.66} \times \sqrt{6.66}} = \frac{57}{60} = .95$$

Regression equation x on y :

$(x - \bar{x}) = r \dfrac{\sigma_x}{\sigma_y} (y - \bar{y})$
or $(x - 5) = .95 \times \dfrac{2.6}{2.6} (y - 12)$
or $(x - 5) = .95 (y - 12)$
or $(x - 5) = .95y - 11.4$
or $x = .95y - 11.4 + 5$
or $x = .95y - 6.4$

Regression equation y on x :

$(y - \bar{y}) = r \dfrac{\sigma_y}{\sigma_x} (x - \bar{x})$
or $(y - 12) = .95 \times \dfrac{2.6}{2.6} (x - 5)$
or $(y - 12) = .95 (x - 5)$
or $(y - 12) = .95x - 4.75$
or $y = .95x - 4.75 + 12$
$y = .95x + 7.25$
According to the x on y equation, the values of x are
calculated for the given values of y from x = .95y - 6.4 :

  y     Estimated value of x
  9     .95 × 9 - 6.4  = 2.15
  8     .95 × 8 - 6.4  = 1.20
 10     .95 × 10 - 6.4 = 3.10
 12     .95 × 12 - 6.4 = 5.00
 11     .95 × 11 - 6.4 = 4.05
 13     .95 × 13 - 6.4 = 5.95
 14     .95 × 14 - 6.4 = 6.90
 16     .95 × 16 - 6.4 = 8.80
 15     .95 × 15 - 6.4 = 7.85

According to the y on x equation, the values of y are
calculated for the given values of x from y = .95x + 7.25 :

  x     Estimated value of y
  1     .95 × 1 + 7.25 =  8.20
  2     .95 × 2 + 7.25 =  9.15
  3     .95 × 3 + 7.25 = 10.10
  4     .95 × 4 + 7.25 = 11.05
  5     .95 × 5 + 7.25 = 12.00
  6     .95 × 6 + 7.25 = 12.95
  7     .95 × 7 + 7.25 = 13.90
  8     .95 × 8 + 7.25 = 14.85
  9     .95 × 9 + 7.25 = 15.80

To plot the regression line of x on y we take the actual
values of y and the computed values of x, and similarly to plot the
regression line of y on x we take the actual values of x and the
computed values of y.
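
The two lines and the points needed for plotting them can also be obtained directly from the data. The sketch below is one possible way of doing so, working from deviations about the actual means; the helper name `regression_lines` is hypothetical and the data are those of Illustration 2.

```python
# Both regression lines from deviations about the actual means.
# The helper name is illustrative; data are from Illustration 2.

def regression_lines(xs, ys):
    """Return ((a, b) for y = a + b*x, (a, b) for x = a + b*y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b_yx = sxy / sxx  # regression coefficient of y on x
    b_xy = sxy / syy  # regression coefficient of x on y
    y_on_x = (mean_y - b_yx * mean_x, b_yx)
    x_on_y = (mean_x - b_xy * mean_y, b_xy)
    return y_on_x, x_on_y

xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [9, 8, 10, 12, 11, 13, 14, 16, 15]
(a_yx, b_yx), (a_xy, b_xy) = regression_lines(xs, ys)

# Plotting points: actual x with computed y, and actual y with computed x.
y_estimates = [round(a_yx + b_yx * x, 2) for x in xs]
x_estimates = [round(a_xy + b_xy * y, 2) for y in ys]
print(y_estimates)
print(x_estimates)
```

Both lines pass through the point of averages (5, 12), which is a convenient check on any plot.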


Illustration 3

A regression equation explaining the average relationship
between the dividend per share (x) and the price per share (y) in 1960
for 100 corporations was

$$y = \text{Rs. } 5.49 + 12.14x$$

Estimate the value of a share of stock which pays a dividend
of Rs. 5 per share. The standard error of the estimate ($S_y$)
is Rs. 4.5.

(M. Com. B.H.U.)

y = Rs. 5.49 + 12.14x
  = Rs. 5.49 + 12.14 × 5
  = Rs. 5.49 + 60.70
  = Rs. 66.19

There is a 68.3% chance that its value will be
Rs. 66.19 ± 1 × 4.5 ($S_y$), i.e. between Rs. 61.69 and Rs. 70.69.

There is a 95.4% chance that its value will be Rs. 66.19
± 2 × 4.5, i.e. between Rs. 57.19 and Rs. 75.19.

There is a 99.7% chance that its value will be Rs. 66.19
± 3 × 4.5, i.e. between Rs. 52.69 and Rs. 79.69.
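
The estimate and its three bands can be reproduced with a few lines of Python. This is a sketch only: the function name `estimate_with_bands` is hypothetical, and the percentages attached to the bands rest on the assumption that the errors of estimate are normally distributed.

```python
# Point estimate and the 1, 2 and 3 standard-error bands of Illustration 3.
# The 68.3%, 95.4% and 99.7% figures assume normally distributed errors.

def estimate_with_bands(intercept, slope, x, std_error):
    estimate = intercept + slope * x
    bands = {k: (estimate - k * std_error, estimate + k * std_error)
             for k in (1, 2, 3)}
    return estimate, bands

estimate, bands = estimate_with_bands(5.49, 12.14, 5, 4.5)
print(round(estimate, 2))
for k, (low, high) in bands.items():
    print(k, round(low, 2), round(high, 2))
```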


Illustration 4

(a) Given :
                         x Series    y Series
Mean                        18          100
Standard Deviation          14           20

Coefficient of correlation between x and y series = +0.8

Find the most probable value of y if x is 70, and the most
probable value of x if y is 90.

(b) If the two regression coefficients are 0.8 and 0.6, what
would be the value of the coefficient of correlation ?

(M. Com. Alld.)

(a) To find the value of x when y = 90, use x on y :

$(x - \bar{x}) = r \dfrac{\sigma_x}{\sigma_y} (y - \bar{y})$
or $(x - 18) = .8 \times \dfrac{14}{20} (90 - 100)$
or $(x - 18) = .56 (-10)$
$x = 18 - 5.6$
$x = 12.4$

To find the value of y when x = 70, use y on x :

$(y - \bar{y}) = r \dfrac{\sigma_y}{\sigma_x} (x - \bar{x})$
or $(y - 100) = .8 \times \dfrac{20}{14} (70 - 18)$
or $(y - 100) = 1.1429 \times 52$
$y = 59.43 + 100 = 159.43$

(b)
$$r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{.8 \times .6} = \sqrt{.48} = +.69$$
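
When only the means, the standard deviations and r are given, as in part (a) above, the two estimates follow from the regression equations in deviation form. The following is a minimal sketch with hypothetical function names, using the figures of Illustration 4.

```python
# Estimates from summary figures only (means, standard deviations and r).
# Function names are illustrative; figures are those of Illustration 4.

def estimate_x_given_y(mean_x, mean_y, sd_x, sd_y, r, y):
    # x on y: (x - mean_x) = r * (sd_x / sd_y) * (y - mean_y)
    return mean_x + r * (sd_x / sd_y) * (y - mean_y)

def estimate_y_given_x(mean_x, mean_y, sd_x, sd_y, r, x):
    # y on x: (y - mean_y) = r * (sd_y / sd_x) * (x - mean_x)
    return mean_y + r * (sd_y / sd_x) * (x - mean_x)

x_hat = estimate_x_given_y(18, 100, 14, 20, 0.8, 90)
y_hat = estimate_y_given_x(18, 100, 14, 20, 0.8, 70)
print(round(x_hat, 2), round(y_hat, 2))
```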


Illustration 5

In a partially destroyed laboratory record of an analysis of
correlation data, the following results only are legible :

Variance of x = 9
Regression equations :

$$8x - 10y + 66 = 0$$
$$40x - 18y = 214$$

What were
(a) the mean values of x and y,
(b) the standard deviation of y,
(c) the coefficient of correlation between x and y ?
(I. A. S.)

(a) Calculation of the mean values

Both regression lines pass through the point of averages, so the
means are found by solving the two equations simultaneously.

$8x - 10y = -66$ (i)
$40x - 18y = 214$ (ii)
$40x - 50y = -330$ (i) × 5, deducted from (ii)
$32y = 544$
$\bar{y}$ or y = 17

Substituting this value of y in equation (i) we get

$8x - 170 = -66$
$8x = 170 - 66$
$8x = 104$
$\bar{x}$ or x = 13

(c) Calculation of the coefficient of correlation

$$r = \sqrt{b_{xy} \times b_{yx}}$$

We have to calculate the values of the two regression
coefficients for determining the value of r. From the question
itself, however, we do not know which equation stands for the
regression of x on y. We assume therefore that the first equation
gives the regression of y on x and the second gives the regression
of x on y. (If this assumption proves wrong, we will assume the
reverse of it and then find r.)

From the first equation, we get

$10y = 8x + 66$
$y = .8x + 6.6$
$b_{yx} = r \dfrac{\sigma_y}{\sigma_x} = +.8$

From the second equation we get

$40x = 18y + 214$
$x = .45y + 5.35$
$b_{xy} = r \dfrac{\sigma_x}{\sigma_y} = +.45$

Substituting these values in the formula
$$r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{.8 \times .45} = \sqrt{.36} = +.6$$

(If our assumption had been that the first equation stands
for x on y and the second for y on x, then :

$8x = 10y - 66$
$x = 1.25y - 8.25$
or $b_{xy} = 1.25$
and $18y = 40x - 214$
$y = 2.2x - 11.9$
or $b_{yx} = 2.2$

Then $r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{1.25 \times 2.2} = \sqrt{2.75} = 1.66$
This is more than 1, and r cannot be more than one.
Hence our first assumption is correct.)

(b) Calculation of the standard deviation of y

Variance or $\sigma_x^2 = 9$, so $\sigma_x = 3$

Regression coefficient of x on y :

$r \dfrac{\sigma_x}{\sigma_y} = .45$
or $.6 \times \dfrac{3}{\sigma_y} = .45$
or $.6 \times 3 = .45\, \sigma_y$
or $1.8 = .45\, \sigma_y$
$\sigma_y = \dfrac{1.8}{.45} = 4$
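
The reasoning used when two regression equations are given without labels (solve them for the means, try one assignment, reject it if the product of the coefficients exceeds 1) can be sketched in Python. The representation of each line as a triple (a, b, c) for a·x + b·y = c and the function name `analyse` are assumptions of this sketch, not part of the text; it also assumes that one of the two assignments is valid.

```python
# Deciding which equation is "y on x" and which is "x on y".
# Each line is written as a*x + b*y = c (an assumption of this sketch).
import math

def analyse(line1, line2, var_x):
    (a1, b1, c1), (a2, b2, c2) = line1, line2
    # Means: the point where the two lines intersect (Cramer's rule).
    det = a1 * b2 - a2 * b1
    mean_x = (c1 * b2 - c2 * b1) / det
    mean_y = (a1 * c2 - a2 * c1) / det
    # Try line1 as y on x and line2 as x on y; swap if r squared exceeds 1.
    b_yx, b_xy = -a1 / b1, -b2 / a2
    if b_yx * b_xy > 1:
        b_yx, b_xy = -a2 / b2, -b1 / a1
    # r takes the sign shared by the two regression coefficients.
    r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)
    # From b_xy = r * sd_x / sd_y.
    sd_y = r * math.sqrt(var_x) / b_xy
    return mean_x, mean_y, r, sd_y

# 8x - 10y = -66 and 40x - 18y = 214, with variance of x equal to 9.
print(analyse((8, -10, -66), (40, -18, 214), 9))
```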


Illustration 6

For certain data y = 1.3x and x = 0.7y are the regression
lines. Compute the coefficient of correlation between x and y.
(M. Com. B.H.U.)

$y = 1.3x$, or $b_{yx} = 1.3$
$x = 0.7y$, or $b_{xy} = .7$

$$r = \sqrt{b_{yx} \times b_{xy}} = \sqrt{1.3 \times .7} = \sqrt{.91} = +0.95$$
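Illustration 7

Two lines of regression are given by

$$x + 2y - 5 = 0$$
and
$$2x + 3y - 8 = 0$$
and $\sigma_x^2 = 12$.

Calculate the values of $\bar{x}$, $\bar{y}$, $\sigma_y^2$ and r.
(M. A. Allahabad)

$x + 2y = 5$ (i)
$2x + 3y = 8$ (ii)
$2x + 4y = 10$ (i) × 2
$2x + 3y = 8$ (ii)
Subtracting, $\bar{y}$ or y = 2

Substituting the value of y in equation (i)

$x + 2y = 5$
$x + 4 = 5$
$x = 5 - 4 = 1$
$\bar{x}$ or x = 1

Assuming that equation (i) stands for y on x
and equation (ii) stands for x on y,
then

$2x + 4y = 10$
$4y = -2x + 10$
$y = -.5x + 2.5$
or $b_{yx} = -.5$
and $2x + 3y = 8$
$2x = -3y + 8$
$x = -1.5y + 4$
or $b_{xy} = -1.5$

$$\sqrt{b_{xy} \times b_{yx}} = \sqrt{-1.5 \times -.5} = \sqrt{.75} = .866$$

Since both regression coefficients are negative, r is also negative :
$r = -.866$

(The reverse assumption gives $b_{xy} = -2$ and $b_{yx} = -.67$, whose
product exceeds 1, so it is rejected.)

For calculating $\sigma_y^2$ :

$b_{xy} = r \dfrac{\sigma_x}{\sigma_y} = -1.5$, with $\sigma_x = \sqrt{12} = 3.46$
$-.866 \times \dfrac{3.46}{\sigma_y} = -1.5$
$1.5\, \sigma_y = 3$
$\sigma_y = 2$
$\sigma_y^2 = 4$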

RATIO OF VARIATION


Sometimes we are interested in the ratio of variation
between two variables. In other words, we want to
know, if the subject changes by 1%, by what fraction of 1%
the relative changes. Although there may be perfect
correlation between two variables, their proportional movements
may not be the same. The ratio of variation is the average ratio
of the percentage deviations from the mean in the relative (y)
to those in the subject (x). The ratio of variation
is a corollary of the coefficient of correlation. It is very
helpful to know the average ratio between the
proportional or percentage deviations of the two curves from
their respective averages. The grain dealer, for example, is anxious
to know how much the price of grain will be raised if the normal crop
is 10% short.


There are two methods of finding the ratio of variation :
(1) Mathematical and (2) Graphical.


Mathematical Method of Computing Ratio of Variation.
The various steps in computing the ratio of variation mathematically
are :


1. The first step is to determine which of the variables
shall be taken as the subject and which shall be considered as the
relative. In the biological field, it usually makes little difference
which series is chosen for each, but in studying the social
sciences, the series having the larger average proportional
deviations is taken as the subject. This is done in order that
the ratio may be expressed as less than unity.

2. Calculate the deviations of the various items of the two series
from their respective means.

3. Divide the deviations of the relative by the
corresponding deviations of the subject.

4. Total the quotients.

5. Divide the total by the number of items to get the desired
ratio of variation.
Illustration 8

From the following data ascertain the ratio of variation
between sales and profits.

Year    Sales        Profit      |  Year    Sales        Profit
        (Lakh Rs.)   (Lakh Rs.)  |          (Lakh Rs.)   (Lakh Rs.)
1940       72           42       |  1946       54           36
1941       84           52       |  1947       63           38
1942       66           48       |  1948       51           34
1943       60           46       |  1949       57           42
1944       48           30       |  1950       69           44
1945       42           28       |  1951       54           40
(M. A. Punjab)

Year       X    dx (60)     Y    dy (40)     dy/dx      Quotient
1940      72     +12       42      +2       +2/+12        0.17
1941      84     +24       52     +12      +12/+24        0.50
1942      66      +6       48      +8       +8/+6         1.33
1943      60       0       46      +6       +6/0          0.00
1944      48     -12       30     -10      -10/-12        0.83
1945      42     -18       28     -12      -12/-18        0.67
1946      54      -6       36      -4       -4/-6         0.67
1947      63      +3       38      -2       -2/+3        -0.67
1948      51      -9       34      -6       -6/-9         0.67
1949      57      -3       42      +2       +2/-3        -0.67
1950      69      +9       44      +4       +4/+9         0.44
1951      54      -6       40       0        0/-6         0.00
Total    720              480                             3.94
Average   60               40

(Note : In division, like signs, whether positive or negative,
give a positive result ; unlike signs give a negative result. The
quotient for 1943, where the deviation of the subject is zero, is
entered as 0.00 in the table.)

Ratio of Variation = 3.94 / 12 = .33

When there is a change of 1 in the subject, the relative
changes by .33.
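
The five steps of the mathematical method can be followed in a short Python sketch. The function name `ratio_of_variation` is hypothetical. A zero deviation in the subject is entered as a zero quotient, as the worked table does; strictly that quotient is undefined, so this is a convention of the sketch rather than a result.

```python
# Ratio of variation as the mean of the quotients dy/dx (Illustration 8).
# A zero deviation in the subject is entered as a zero quotient, following
# the worked table; strictly, that quotient is undefined.

def ratio_of_variation(subject, relative):
    n = len(subject)
    mean_s = sum(subject) / n
    mean_r = sum(relative) / n
    quotients = []
    for s, r in zip(subject, relative):
        ds, dr = s - mean_s, r - mean_r
        quotients.append(dr / ds if ds != 0 else 0.0)
    return sum(quotients) / n

sales = [72, 84, 66, 60, 48, 42, 54, 63, 51, 57, 69, 54]
profit = [42, 52, 48, 46, 30, 28, 36, 38, 34, 42, 44, 40]
print(round(ratio_of_variation(sales, profit), 2))
```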

Graphic Method. When the oscillations are not regular or the
arithmetic average is unsatisfactory, the best method of getting
the ratio of variation is by means of Galton's graph.


The method of drawing the graph may be stated as
follows :

(1) Find the mean values of the two series.

(2) Convert the variables into indices with the mean
values of the respective series as the base.

(3) Plot points with the help of the indices, taking the indices
of the subject (x) on the vertical scale and those of the relative (y)
on the horizontal scale.

(4) Draw a line of 'best fit', keeping in mind that :
(a) The number of points on either side of this line is
approximately equal.
(b) The points on either side should be equidistant
from the line.
(c) The line should pass through the averages of the two
series. Since the two averages are both equal to 100, the line
should pass through the point (100, 100).


In case there is perfect correlation between the two series, all
the points lie on a straight line or a well defined curve. To
obtain the ratio of variation, a horizontal line should be drawn
from any point on the ordinate to cut the line of regression.
The length of this line from the ordinate up to the line of
regression, divided by the distance from this arbitrary point to
the point where the regression line reaches the ordinate, will
give the required ratio of variation.


Let us take the previous example to show the ratio of
variation graphically.


Year        X     Indices        Y     Indices
                 (60 = 100)           (40 = 100)
1940       72       120         42       105
1941       84       140         52       130
1942       66       110         48       120
1943       60       100         46       115
1944       48        80         30        75
1945       42        70         28        70
1946       54        90         36        90
1947       63       105         38        95
1948       51        85         34        85
1949       57        95         42       105
1950       69       115         44       110
1951       54        90         40       100
Total     720                  480
Average    60                   40

[Figure : Galton's graph of the two series of indices, with the line of
best fit passing through the point (100, 100) ; the points A, B and C
are marked on the graph.]

Ratio of variation = BC / BA = 0.34, where BA = 50 on the scale of the graph.


The ratio of regression is 1 - Ratio of Variation.
In this example 1 - .34 = .66 is the ratio of regression.


Theoretical Questions


1. Explain the concept of regression and ratio of variation
and state their utility in the field of economic enquiries.
(M. A. Punjab)
2. Define regression. Why are there two regression lines
when the coefficient of correlation is not unity ?
3. Explain with illustration or otherwise the meaning of the
term regression equations. Prove that r is the geometric mean
between the regression coefficients of y on x and x on y.
4. Show that the coefficient of correlation is the geometric
mean between the two regression coefficients.
(M. A. Eco. Delhi)


Practical Questions
1. Given the following data, calculate the expected value of
y when x is 12 and of x when y is 20 :

                         x         y
Mean Value             10.5      25.5
Standard Deviation      2.2       4.4

Coefficient of correlation between x and y = +0.9.
(M. Com. B. H. U.)

(Ans. y = 28.2, x = 8.025)


2. You are given the following results for the heights (x)
and weights (y) of 1000 workers of a factory :

Mean height = 68.0 inches and standard deviation of height = 2.5 inches
Mean weight = 150.0 lb. and standard deviation of weight = 20.0 lb.
r = +0.60

Estimate from the above data (i) the height of a particular
factory worker whose weight is 200 lb. and (ii) the weight of a
particular factory worker who is 5 feet tall.

(M. Com., B. H. U.)

(Ans. Height = 71.75 inches and weight = 111.6 lb.)


3. Find the most likely price in Bombay corresponding to
the price of Rs. 70 at Calcutta from the following data :

Average Price         Calcutta 65,   Bombay 67
Standard Deviation    Calcutta 2.5,  Bombay 3.5
r between the two prices = +0.8
(M. Com. Agra)

(Ans. Rs. 72.6)


4. Given the following data, calculate the expected value of
y when x is 12.

                         x         y
Average                 7.6      14.8
Standard Deviation      3.6       2.5

r = +.99

(M. Com. Raj.)
(Ans. 17.825)


5. The ages of husband and wife in a community were found
to have a correlation coefficient equal to +0.8 ; the average ages
of husband and wife were 25 and 22 years and their standard
deviations 4 and 5 years respectively. Draw the two lines of
regression and estimate the expected age of the husband when the
wife's age is 12 years and the expected age of the wife when the
husband's age is 33 years.

(Ans. Husband's age 18.60 years, wife's age 30 years)


6. Determine the equation of the straight line which best fits
the following data :

X : 10 12 13 16 17 20 25
Y : 19 22 24 27 29 33 37

(Ans. x on y : x = .812y - 6.013
y on x : y = 1.213x + 7.705)

7. Given the following data for two tests :

                         History (x)    English (y)
Mean                         75             70
Standard Deviation            6              8

r = +0.72

(a) Work out the two regression equations.

(b) Predict the probable grade in English of a student
whose history marks are 65.

(Ans. y = .96x - 2, x = .54y + 37.2 ; 60.4)

8. Given :

x = .85y, y = .89x, standard deviation of x = 3

Find the standard deviation of y and r.
(Ans. standard deviation of y = 3.07, r = +0.87)


9. (a) What is meant by the coefficient of concurrent deviations ?
Calculate the coefficient from the following data :

Number of pairs of concurrent deviations = 198
Number of pairs of observations = 340

(b) Given the following data, calculate the expected value
of y when x = 12.

                         x         y
Average                 7.6      14.8
Standard Deviation      3.6       2.5
r = +0.99

(M. Com. Alld. ; M. A. Raj.)

(Ans. (a) +.4059, (b) 17.825)


10. The following data are given for marks in subjects A and
B in a certain examination :

Mean Marks in A = 39.5
Mean Marks in B = 47.5
Std. Dev. of Marks in A = 10.8
Std. Dev. of Marks in B = 16.8

Coefficient of correlation between A and B = +0.42.

Draw the two lines of regression and explain why there are
two equations of regression.

Also give the expectation of marks in subject B for a
candidate who secured 50 marks in subject A.
(M. A. Punj. ; M. Com. Alld.)

(Ans. 54.325. Equations are (i) y - 47.5 = .65 (x - 39.5)
(ii) x - 39.5 = .27 (y - 47.5))

11. Plot a Galton's graph from the following table and show
the ratio of variation between bank clearings and immigrants for
eight years.

Year      Immigrants              Bank clearings
          (tens of thousands)     (in millions)
1              79                      49
2              52                      40
3              33                      25
4              55                      35
5              46                      35
6              62                      34
7              31                      34
8              34                      28
Average        49                      35

(M. Com. Alld.)
(Ans. Ratio of variation = .4 approx.)


12. Calculate the coefficient of correlation between x and y
and the regression equations from the following data :

y Series                        x Series
             10-20   20-30   30-40   40-50   50-60   Total
20-30          4       6      10       -       -       20
30-40          2       5       9       4       -       20
40-50          -       6      15      10       4       35
50-60          -       1       7      12       3       23
60-70          -       -       5       8       2       15
Total          6      18      46      34       9      113

(Ans. r = +0.537, y - 44.38 = .7 (x - 36.95)
x - 36.95 = .413 (y - 44.38))

13. In the following table are recorded data showing the test
scores made by salesmen on an intelligence test and their
weekly sales.

Test scores :            40   90   40   60   60   (remaining entries not legible)
Sales in '000 units :   2.5  6.0  4.5  5.0  4.5  2.0  5.5  3.0  4.5  8.0

Calculate the regression line of sales on test score and
estimate the most probable weekly sales volume if a salesman
makes a score of 70. What will be the sampling error of your
estimate ?

(Ans. regression coefficient = .66, estimated sales for a score of 70 = 4.65,
sampling error = ± .88)

14. Calculate the coefficient of correlation and obtain the
lines of regression for the following data :

x : 1 2 3 4 5 6 7 8 9
y : 9 8 10 12 11 13 14 16 15

Obtain an estimate of y which should correspond on the
average to x = 6.2.
(I. A. S.)

(Ans. r = +.95. Regression equation y on x : y = .95x + 7.25
Regression equation x on y : x = .95y - 6.4
The value of y is equal to 13.14 when x = 6.2)


15. The following marks have been obtained by a class of
students in statistics (out of 100) :

Paper I  : 80 45 55 56 58 60 65 68 70 75 85
Paper II : 82 56 50 48 60 62 64 65 70 74 90

Compute the coefficient of correlation for the above data.
Find the lines of regression and examine the relationship.

(I. A. & A. S.)
(Ans. r = +.91, y = .99x + .9, x = .85y + 9.5)


CHAPTER 18


INDEX NUMBERS 


Meaning and objects. An 'Index Number' is a device for
comparing the general level of magnitude of a group of distinct
but related variables in two or more situations. If we want to
compare the price level in India in 1960 with what it was in
1945, we shall have to consider a group of variables, such as the
prices of wheat, cloth, vegetables etc. If all these variables
changed in exactly the same ratio, there would be no difficulty in
finding out the change in the price level as a whole. But in
practice the prices of different commodities change in different
ratios and in different directions. The difficulty arises when
these relative changes have to be averaged. What we want is
one figure as an indicator or index of the change in the
magnitude of the prices of different commodities as a whole, so
that one can know the extent and direction of change. Thus
an index number performs a function similar to that of an
average. According to Mr. Blair, "Index Numbers are a
specialised type of average." In the words of Croxton and
Cowden, "Index Numbers are devices for measuring differences
in the magnitude of a group of related variables."


According to Fisher the purpose of an index number is
"that it shall fairly represent, so far as one single figure can,
the general trend of the main diverging ratios from which it is
calculated." Accordingly the aim of index numbers is to
provide a basis for comparison of different aspects of a
phenomenon over a certain period of time. "They are particularly
valuable where it is desired to compare complicated nature of
the data or the imperfection of the knowledge concerning it."
The index number is thus designed to show the relative change or
difference of a group of related variables.


The technique of index numbers is helpful in the study of
changes which cannot be measured directly. According
to Dr. A. L. Bowley, "Index Numbers are used to measure the
changes in some quantity which we cannot observe directly."
Index numbers are meant for comparison. They may compare
(i) changes occurring over time, (ii) differences between places
and (iii) differences between like categories.


The credit for developing the technique of index numbers
goes to Carli. He constructed index numbers in 1764 in order to
find out changes in the purchasing power of money.
W. S. Jevons, Dutot, Marshall, Irving Fisher, Walsh and other
economists also used this technique for the same purpose. Ever
since index numbers were first compiled, their use has greatly
and steadily increased and their usefulness has been amply
proved. In modern times, this technique is applied not only to
measure the change in the value of money or the price level, but also
to measure changes in production, industrial activity, cost of
living, industrial profits, foreign trade and similar other
economic facts. Index numbers are rightly called 'Economic
Barometers'.


Statistical Aspects of Index Numbers. The statistical
technique of the construction of index numbers involves the
following processes :

1. Definition of the purpose.
2. Selection of the items or 'Regimen'.
3. Selection of sources of data.
4. Collection of data.
5. Selection of base.
6. Form of average to be used.
7. System of weighting.

1. Definition of the Purpose. Before actually constructing
an index number it is necessary to define its purpose
very clearly. There is no all-purpose index number. It is important
to know beforehand what we are trying to measure and also
how we intend to use our measures. Index numbers are
specialised tools and as such are more efficient and useful when
properly used. The selection of items etc. will depend upon the
purpose of construction of the index number. If it is desired to
construct a cost of living index number for the labour class, then
only those items will be included which are required by the
labour class.

2. Selection of the Items or 'Regimen'. The list of commodities
included in an index number is called the 'Regimen'. All
index numbers are designed to measure particular groups of
related changes. But an index does not cover all such changes ;
merely a selection of changes is taken into account. All the more
important commodities should be included so that the regimen may
become as representative as possible. The selection of the items
depends upon the purpose of construction of the index number. The
following broad factors must be given due consideration while
selecting items :


(a) The items should be representative. It means that a
sufficiently large sample of relevant items must be selected to
obtain reliable index numbers. If a cost of living index number
is to be calculated, the items to be included therein must
represent the consumption habits of that class of people. An index of
cost of living necessitates the inclusion of not only food articles,
but also of rent, electricity, clothing, transportation, medical
care, education and so forth. The most satisfactory way of
accomplishing this is to divide the items into groups and sub-groups
and to draw a representative sample from each of them.
In selecting the commodities from a group, it is desirable to pick
those which tend to conform most closely to the central
tendency of the group.


(b) The items should be of a standard quality. The purpose
of an index number is comparison. Hence the standard quality
of different items should be included, because only standard
grades of the same commodity are comparable between different
dates. For example, in the construction of an index number
under the 'food group', wheat may be an item. There are a number
of qualities of wheat. Only that quality should be included
which is in common demand. Besides this, the same
quality should be selected every time, so that comparison may be made.

(c) Non-tangible items should be excluded. It is difficult to
ascertain the values or prices of non-tangible items. Hence they
should not find a place in the 'regimen'. Such items are personal
services, goodwill etc.


(d) The number of items. Another point of importance
is to determine the number of items to be taken. The number of
items will depend upon the technique of collection and processing
of data. With the development of electronic computing
machines, there is a tendency to include a larger number of items.
No definite limit can be laid down for this. Different agencies
have taken different numbers of items. Babson is satisfied with
10 items only, while in the list of the U. S. Bureau of Labour
Statistics the number is as high as 450. Under modern
conditions, with increasing demand for commodities and a better
standard of living, there is a tendency among agencies to include
a large number of items. The larger the number of items,
the smaller will be the chances of error in the average. We must,
however, have a manageable number and also aim at a
reasonable standard of accuracy.

3. Selection of Sources of Data. When selecting the sources
of data for index numbers, we may rely on regularly published
quotations or obtain periodic special reports from the merchants,
producers, exporters or others who possess the basic information
needed. Under either circumstance, we must make sure that the
data pertain strictly to the thing being measured. We should
see that the data obtained are accurate, comparable, representative
and adequate for the purpose.

4. Collection of Data. In the collection of quotations attention
should be paid to the following considerations :


(a) The method of quoting prices should be specifically
determined. There are two methods of quoting prices : (i)
Money prices, in which prices are quoted per unit of a commodity,
e.g. wheat at Rs. 50/- per quintal (100 kilograms) ; (ii)
Quantity prices, in which prices are quoted per unit of money,
e.g. wheat at 2 kilograms a rupee. The former method is more
logical.


(b) Another consideration is the type of
quotation, whether wholesale or retail, that should be obtained. A
wholesale price index or general index requires wholesale price
quotations. But for constructing a cost of living index number,
retail price quotations are desirable.


(c) Price quotations should be obtained from important
markets. Since it is neither possible nor necessary to collect the
price of a commodity from all the markets where it is bought
and sold, we should take a sample of the markets also. In
selecting a sample of markets care should be taken to see that
the markets included are well known for trading in
that particular commodity.

(d) The next thing is to select an agency from which price
quotations are to be obtained. We should try to select the
agency which is most reliable. To check the accuracy of the
price quotations supplied by an agency it is desirable to obtain
such quotations from more than one reporting agency.


(e) In order to ensure better results it is advisable to take
a 'standard price', which implies a representative price of a
commodity for the whole interval under consideration. It should be
neither the quotation at the beginning of the month, week or day,
nor the quotation in the middle, nor the quotation at the end.
If it is a weekly index, it would be better to
collect quotations for all the days in the week and take an average
thereof. For a monthly index number an average of weekly quotations
may be taken. As regards a yearly index number, an average of
monthly quotations will serve the purpose. This will not allow
abnormal fluctuations to creep in.


(f) Due care must be taken in selecting the enumerators
who shall be responsible for the collection of data. Upon their
quality and intelligence will depend the quality and reliability of
the index number.


5—Selection of Base. In the construction of index numbers
the selection of a proper base is very important. Every index
number must have a base, a statistical hitching post from which
to express the change. The base should be as recent and as normal
as possible; there should be no abnormal conditions in the base
year. There are four methods of choosing the base period:

(1) Fixed Base

(2) Average Base

(3) Chain Base

(4) Laspeyre's method.

(1) Fixed Base Method. According to this method a particular
year is taken as the base. Prices during that year are taken as
equal to 100 and prices of the following years are shown as
percentages of the base-year prices. If index numbers are
constructed for 1950, 1951 and so on, keeping the year 1948 as
base for all these years, then the price levels of the different
years are all comparable with the price level of 1948. In this
method care should be taken in selecting the base year. It should
be a normal year, that is, one free from the effects of business
cycles (boom or depression) and from special circumstances such
as war or famine.


(2) Average Base Method. It is often true that no one year
is sufficiently normal to be a good basis of comparison. An
average of several years is then usually a better base. Taking
the average prices over a period of years helps to reduce the
effect of abnormalities to a very great extent. The period
selected should be of at least five years, and the prices of this
period are averaged to serve as the base.


(3) Chain Base Method. If comparison is desired from year
to year, a system of chain base is used. According to this method
the relatives for each year are worked out on the prices of the
preceding year. Chain base index numbers are also called
Link Index Numbers. For example, if index numbers are to be
constructed for 1950, 1951, 1952 and 1953, then 1950 will be the
base for 1951, 1951 will be the base for 1952, and so on.

(4) Laspeyre System. Although a particular base may be
satisfactory for a number of years, that base becomes less
meaningful as time passes, and it eventually becomes desirable
to shift to a more recent period. The shift is desirable for the
following reasons: (i) the dispersion of price relatives may become
so great that no average is reliable; (ii) because of permanent
currency depreciation, growth of population, technological
developments and other reasons, new and higher levels may have
been attained by income, prices, production and consumption;
(iii) the pattern of consumption may change to such an extent that
no aggregate of commodities can be found which includes the
major expenditures common to both periods; (iv) the quality of
many commodities, nominally the same, changes progressively
with time. The tendency nowadays is to keep the Laspeyre
type of index up to date by regular revision of the base year.
According to Laspeyre, a time comes when the base year must
be revised.
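
The first three base methods described above can be sketched in a few lines of Python. This is only an illustration: the price series is hypothetical (it is not taken from the text), the function names are invented for the sketch, and four years are averaged for brevity although the text recommends at least five.

```python
def fixed_base_relatives(prices, base_year):
    """Each year's price as a percentage of the base-year price."""
    base = prices[base_year]
    return {year: price / base * 100 for year, price in prices.items()}


def average_base_relatives(prices, base_years):
    """As above, but the base is the mean price of several years."""
    base = sum(prices[year] for year in base_years) / len(base_years)
    return {year: price / base * 100 for year, price in prices.items()}


def link_relatives(prices):
    """Chain base: each year's price as a percentage of the preceding year's."""
    years = sorted(prices)
    links = {years[0]: 100.0}  # the first year has no predecessor
    for previous, current in zip(years, years[1:]):
        links[current] = prices[current] / prices[previous] * 100
    return links


# Hypothetical prices of a single commodity (assumed figures).
prices = {1950: 20.0, 1951: 22.0, 1952: 25.0, 1953: 30.0}

fixed = fixed_base_relatives(prices, base_year=1950)
averaged = average_base_relatives(prices, base_years=[1950, 1951, 1952, 1953])
links = link_relatives(prices)

for year in sorted(prices):
    print(year, round(fixed[year], 1), round(averaged[year], 1), round(links[year], 1))
```

With a fixed base every figure is read against 1950, whereas each link relative is read only against the year immediately before it.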

6—Form of Average to be used. For constructing an index
number any form of average, such as the arithmetic mean, median,
mode, geometric mean or harmonic mean, can be used. From the
practical point of view the median and mode are unsuitable because
they are erratic. The geometric mean and harmonic mean
are difficult to calculate, hence the arithmetic average is
generally used. With the development of electronic computers,
however, the use of the geometric mean is also becoming popular.

There are two methods of constructing index numbers: (1)
by computing aggregate values; (2) by averaging relatives.


Aggregate Method. The Bradstreet agency adopted this method.
Under this method the prices per unit of the different commodities
are taken for the years under consideration and their totals are
then compared. In this method there need be no base year, and the
prices of any two years can be compared simply by taking the
totals of commodity prices. No explicit weights are assigned to
the items. The aggregate method obtains the result directly and
produces a result that has a simple and clear meaning. But it is
not a trustworthy method, because the total depends on the units
in which the prices happen to be quoted and is dominated by the
high-priced items.


Averaging of Relatives Method. According to this method the
prices of the various commodities are turned into percentages of
their prices in a particular year taken as the base. These
percentages (price relatives) are then totalled and averaged.


7—System of weighting. In order to allow each commodity
to have a reasonable influence on the index, it is advisable to use
a suitable weighting system. In an unweighted index number of
prices, all commodities are given equal importance. But in actual
practice different commodities command different degrees of
importance. In an unweighted index number, if the price of wheat
is doubled and that of tobacco is halved, there may not be any
change in the index. But the increase in the price of wheat will
hurt the consumer far more than the reduction in the price of
tobacco will benefit him. That is the main reason for assigning
weights to the various items while constructing index numbers.
The system of weighting depends on the purpose of the index, but
the weights ought to reflect the relative importance of the
commodities in the regimen in the relevant sense. The system
of weighting may be either arbitrary or rational. Arbitrary or
chance weighting means that the statistician is free to assign
weights to the different items as he thinks fit or reasonable.
Rational or logical weighting means that some criterion has been
fixed for assigning weights. The weights may be according to:


(i) The value or quantity produced
(ii) The value or quantity consumed
(iii) The value or quantity sold or put for sale.

It is advisable that for constructing a cost of living index
number the weights should be according to consumption, and for
a general index number according to production.


Weights may be either (i) implicit weights or (ii) explicit
weights.

Implicit weighting implies the inclusion of a commodity or
its varieties in the index number more than once; e.g. if
wheat has to be assigned thrice as much weight as other
commodities, then three varieties of wheat may be included in the
regimen as against one of each other commodity. Explicit weighting
implies that weights are expressly laid down on the basis of some
outward evidence of the importance of the items. What this
evidence should be is a separate problem to decide.


Another problem with regard to the system of weighting is 
whether weights should be fixed or fluctuating. In fact, the 
relative importance of the different commodities is constantly 
changing. If weights are allowed to vary from period to period, 
index number will give better results. Such an index number 
gives an idea not only of changes in the prices but also of 
shifts in importance. 
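
The wheat and tobacco example given earlier in this section can be checked numerically. The sketch below is only illustrative: the weights (9 for wheat, 1 for tobacco) are assumed figures, not taken from the text, and the function names are invented. It suggests that the unweighted index shows no change only when the relatives are averaged geometrically, while any reasonable weighting shows a marked rise.

```python
import math

# Price relatives: wheat has doubled, tobacco has halved.
relatives = {"wheat": 200.0, "tobacco": 50.0}
# Hypothetical weights reflecting relative importance (assumed figures).
weights = {"wheat": 9.0, "tobacco": 1.0}


def arithmetic_mean(rel):
    """Unweighted arithmetic mean of the relatives."""
    return sum(rel.values()) / len(rel)


def geometric_mean(rel):
    """Unweighted geometric mean of the relatives."""
    return math.exp(sum(math.log(r) for r in rel.values()) / len(rel))


def weighted_mean(rel, w):
    """Weighted arithmetic mean of the relatives."""
    return sum(rel[k] * w[k] for k in rel) / sum(w[k] for k in rel)


unweighted_am = arithmetic_mean(relatives)
unweighted_gm = geometric_mean(relatives)
weighted = weighted_mean(relatives, weights)

print(round(unweighted_am, 1), round(unweighted_gm, 1), round(weighted, 1))
```

Under these assumed weights the weighted index stands far above 100, which is the point of the argument for weighting.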


Methods of Weighting. No system of weighting should be
used which is not logical. The following are the different
methods of weighting used by statisticians.


(A) Weighted Average of Relatives Method. This method is
also called the ‘Family Budget Method’. In this method the
weights are the values (price × quantity) of the base year.
Symbolically it is expressed as:

\[ \text{Index Number} = \frac{\sum IV}{\sum V} \]

where $I$ = Relatives and $V$ = Values.


(B) Base period quantities as weights. This method is
also known as Laspeyre's method. The base year's quantities are
used as weights. Symbolically

\[ \text{Index Number} = \frac{\sum P_1 q_0}{\sum P_0 q_0} \times 100 \]

Where $P_1$ = Price of the current year
$P_0$ = Price of the base year
$q_0$ = Quantity of the base year.


(C) Current year's quantities as weights. This method is
also known as Paasche's method. In this method the current year's
quantities are used as weights. This method involves the
selection of a new set of weights each year. Symbolically

\[ \text{Index Number} = \frac{\sum P_1 q_1}{\sum P_0 q_1} \times 100 \]

Where $P_1$ = Price of the current year
$P_0$ = Price of the base year
$q_1$ = Quantity of the current year.


(D) Marshall-Edgeworth's Method. In this method the average
(or total) of the quantities of the current and base years is
used as weights. This is a compromise solution used by two
distinguished English economists, Marshall and Edgeworth.
The formula is

\[ \text{Index Number} = \frac{\sum P_1 (q_0 + q_1)}{\sum P_0 (q_0 + q_1)} \times 100 \]


(E) Average of the quantities for all the years which the
index numbers include may also be used as weights. Although it
is an excellent solution for a historical study, this method is
impracticable.


(F) Average of the quantities of several years which are 
thought to be typical may be used as weights. 


(G) Highest Common Factor of the quantities as weights.
This ingenious device has been suggested by J. M. Keynes to
avoid the sort of bias inherent in methods (B) and (C).
According to this method the weights are the quantities of each
commodity that are common either to the base year and the given
year, or to all the years under consideration.


(H) Fisher's Method. According to this method two index
numbers with different sets of weights are constructed and their
geometric average is found. This method is also called the
Crossed-Weight formula. The formula is

\[ \text{Index Number} = \sqrt{\frac{\sum P_1 q_0}{\sum P_0 q_0} \times \frac{\sum P_1 q_1}{\sum P_0 q_1}} \times 100 \]

Where $P_1$ = Price of the current year
$P_0$ = Price of the base year
$q_1$ = Quantity of the current year
$q_0$ = Quantity of the base year.

It is frequently called Fisher's ‘Ideal Index Number’, because
it conforms to certain tests of consistent behaviour which
Irving Fisher considered appropriate.

(I) Drobisch and Bowley Formula. It is similar to Fisher's
method, but instead of the geometric mean the arithmetic average
of the two index numbers is calculated.

\[ \text{Index Number} = \frac{1}{2} \left( \frac{\sum P_1 q_0}{\sum P_0 q_0} + \frac{\sum P_1 q_1}{\sum P_0 q_1} \right) \times 100 \]

(J) Walsh's Formula

\[ \text{Index Number} = \frac{\sum P_1 \sqrt{q_0 q_1}}{\sum P_0 \sqrt{q_0 q_1}} \times 100 \]
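
The weighted formulae (B), (C), (D), (H), (I) and (J) above can be compared side by side with a short Python sketch. The prices and quantities are illustrative figures for three articles, and the function names are invented for this sketch; Walsh's formula is taken in its usual form, with the geometric mean of the two quantities as weight.

```python
import math


def sum_products(prices, quantities):
    """Sum of price x quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


def laspeyres(p0, q0, p1, q1):
    return sum_products(p1, q0) / sum_products(p0, q0) * 100


def paasche(p0, q0, p1, q1):
    return sum_products(p1, q1) / sum_products(p0, q1) * 100


def marshall_edgeworth(p0, q0, p1, q1):
    q = [a + b for a, b in zip(q0, q1)]  # total of base and current quantities
    return sum_products(p1, q) / sum_products(p0, q) * 100


def fisher(p0, q0, p1, q1):
    # geometric mean of the Laspeyres and Paasche index numbers
    return math.sqrt(laspeyres(p0, q0, p1, q1) * paasche(p0, q0, p1, q1))


def drobisch_bowley(p0, q0, p1, q1):
    # arithmetic mean of the Laspeyres and Paasche index numbers
    return (laspeyres(p0, q0, p1, q1) + paasche(p0, q0, p1, q1)) / 2


def walsh(p0, q0, p1, q1):
    q = [math.sqrt(a * b) for a, b in zip(q0, q1)]  # geometric mean of quantities
    return sum_products(p1, q) / sum_products(p0, q) * 100


# Illustrative base-year and current-year figures for three articles.
p0, q0 = [5, 8, 6], [10, 6, 3]
p1, q1 = [4, 7, 5], [12, 7, 4]

formulae = (laspeyres, paasche, marshall_edgeworth, fisher, drobisch_bowley, walsh)
results = {fn.__name__: fn(p0, q0, p1, q1) for fn in formulae}

for name, value in results.items():
    print(f"{name:20s} {value:.2f}")
```

On these figures the six results lie very close together, and Fisher's index falls between Paasche's and Laspeyre's, as a geometric mean of the two must.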


Construction of Various Types of Index Numbers

Calculation of price relatives

(i) Under Fixed Base Method

\[ \text{Relative} = \frac{\text{Current year's price}}{\text{Base year's price}} \times 100 \quad \text{or} \quad \frac{P_1}{P_0} \times 100 \]

(ii) Under Chain Base Method

\[ \text{Link Relative of the current year} = \frac{\text{Current year's price}}{\text{Previous year's price}} \times 100 \]



Illustration—1 

Find out the index numbers for 1958, 1959 and 1960 taking 
1950 as the base year by (i) Aggregative and (ii) Relative 
methods. 


Commodity    1950    1958    1959    1960
A              2       5       4       3
B              8      11      13       6
C              4       5       6       8
D              6       4       5       7
E              5       4       6       3
Total         25      29      34      27

Index Numbers by Aggregate Method

\[ P_{1958} = \frac{\sum P_1}{\sum P_0} \times 100 = \frac{29}{25} \times 100 = 116.0 \]

\[ P_{1959} = \frac{\sum P_1}{\sum P_0} \times 100 = \frac{34}{25} \times 100 = 136.0 \]

\[ P_{1960} = \frac{\sum P_1}{\sum P_0} \times 100 = \frac{27}{25} \times 100 = 108.0 \]

Index Numbers by Relative Method

\[ P_{1958} = \frac{\text{Sum of Relatives}}{\text{Number of items}} = \frac{659.2}{5} = 131.84 \]

\[ P_{1959} = \frac{715.8}{5} = 143.16 \]

\[ P_{1960} = \frac{601.7}{5} = 120.34 \]

Illustration—2 


The following table gives the average wholesale prices of 
the commodities A,B,C,D,E, during the years 1957, 1958, 1959 
and 1960. Find out the index Numbers by chain base method :— 


Average wholesale prices in the years

Commodities    1957    1958    1959    1960
A                30      32      36      40
B                20      22      25      30
C                40      44      46      50
D                45      50      52      55
E                55      60      61      65

Relatives on the basis of the preceding year

                   1957             1958             1959             1960
Commodities   Price Relative   Price Relative   Price Relative   Price Relative
A               30    100        32    107        36    113        40    111
B               20    100        22    110        25    114        30    120
C               40    100        44    110        46    105        50    109
D               45    100        50    111        52    104        55    106
E               55    100        60    109        61    102        65    107
Total                 500              547              538              553
Index Number          100              109              108              111

Illustration—3

Prepare Index Numbers of prices for three years with
average prices as base:

Rate per rupee

              Wheat       Cotton      Oil
I Year        10 Srs.     4 Srs.      3 Srs.
II Year        9 Srs.     3½ Srs.     3 Srs.
III Year       9 Srs.     3 Srs.      2½ Srs.

                      Average     First Year       Second Year      Third Year
Commodities  Unit     Price     Price Relative   Price Relative   Price Relative
Wheat        Per md.    4.3      4.0     93       4.4    102       4.4    102
Cotton       ,,        11.6     10.0     86      11.4     98      13.3    115
Oil          ,,        14.2     13.3     94      13.3     94      16.0    113
Total of Relatives                      273              294              330
Average of Relatives                     91               98              110

Note. The prices should first be converted into rupees per
md. (1 md. = 40 srs.). Then the average should be calculated.

Illustration —4 


Find out the index numbers for 1958, 1959 and 1960 based 
on 1950 using Arithmetic Mean, Median and Geometric mean :— 


Commodity    1950    1958    1959    1960
A            3.75    7.50    5.00    6.00
B            2.50    3.00    4.00    3.25
C            3.00    4.50    2.00    2.50
D            2.00    2.00    3.00    4.00
E            4.25    3.75    4.00    5.00

                    1950             1958             1959             1960
Commodity      Price Relative   Price Relative   Price Relative   Price Relative
A              3.75    100      7.50    200      5.00    133      6.00    160
B              2.50    100      3.00    120      4.00    160      3.25    130
C              3.00    100      4.50    150      2.00     67      2.50     83
D              2.00    100      2.00    100      3.00    150      4.00    200
E              4.25    100      3.75     88      4.00     94      5.00    118
Total of Relatives     500              658              604              691
Arithmetic Average     100            131.6            120.8            138.2
Median                 100              120              133              130
G.M.                   100            125.9            115.0            132.4
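
The three averages of Illustration 4 can be recomputed with Python's standard `statistics` module. The sketch takes the 1950 price of A as 3.75 and the 1960 price of B as 3.25, the readings consistent with the relatives in the working, and rounds each relative to a whole number as the hand computation does. The variable and function names are invented for the sketch.

```python
from statistics import geometric_mean, mean, median

YEARS = [1950, 1958, 1959, 1960]

# Prices in the order of YEARS; 3.75 (A, 1950) and 3.25 (B, 1960) are assumed readings.
prices = {
    "A": [3.75, 7.50, 5.00, 6.00],
    "B": [2.50, 3.00, 4.00, 3.25],
    "C": [3.00, 4.50, 2.00, 2.50],
    "D": [2.00, 2.00, 3.00, 4.00],
    "E": [4.25, 3.75, 4.00, 5.00],
}


def relatives_for(position):
    """Whole-number price relatives on the 1950 base for one year."""
    return [round(row[position] / row[0] * 100) for row in prices.values()]


summary = {}
for position, year in enumerate(YEARS):
    relatives = relatives_for(position)
    summary[year] = (mean(relatives), median(relatives), geometric_mean(relatives))

for year, (am, med, gm) in summary.items():
    print(year, round(am, 1), med, round(gm, 1))
```

The geometric mean is the lowest of the three in each year here, since it is pulled up less by the largest relatives than the arithmetic mean is.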


Illustration—5 


Compute the weighted index numbers from the following 
prices of commodities A,B and C. 


Commodity    Weights    1958    1959    1960
A              10         5       4       7
B               6         4       6       5
C               4         6       8       9

(Computation by Aggregate Method)

In the working below $P_0$, $P_1$ and $P_2$ denote the prices of
1958, 1959 and 1960 respectively, and $W$ the weights.

                         1958            1959            1960
Commodity   Weights   P0   P0×W      P1   P1×W      P2   P2×W
A             10       5     50       4     40       7     70
B              6       4     24       6     36       5     30
C              4       6     24       8     32       9     36
Total                        98            108            136

\[ \text{I. No. (1959)} = \frac{\sum P_1 W}{\sum P_0 W} \times 100 = \frac{108}{98} \times 100 = 110.2 \]

\[ \text{I. No. (1960)} = \frac{\sum P_2 W}{\sum P_0 W} \times 100 = \frac{136}{98} \times 100 = 138.8 \]

(Computation by Relative Method)

                        1958               1959                    1960
Commodity   Weights   P0  Relative   P1  Relative   R×W      P2  Relative   R×W
A             10       5    100       4     80      800       7    140     1400
B              6       4    100       6    150      900       5    125      750
C              4       6    100       8    133      533       9    150      600
Total         20                                   2233                    2750

\[ \text{I. No. (1959)} = \frac{\sum RW}{\sum W} = \frac{2233}{20} = 111.65 \]

\[ \text{I. No. (1960)} = \frac{\sum RW}{\sum W} = \frac{2750}{20} = 137.5 \]

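
Both computations of Illustration 5 can be sketched as follows, taking the 1958 price of B as 4 (the value used in the working). The function names are invented for this sketch. The relatives are kept unrounded, so the 1959 figure by the relative method comes out slightly above the hand-worked 111.65.

```python
weights = {"A": 10, "B": 6, "C": 4}
prices = {
    1958: {"A": 5, "B": 4, "C": 6},  # base year; B = 4 is the assumed reading
    1959: {"A": 4, "B": 6, "C": 8},
    1960: {"A": 7, "B": 5, "C": 9},
}
BASE = 1958


def weighted_aggregate_index(year):
    """Sum of price x weight, as a percentage of the same sum for the base year."""
    numerator = sum(prices[year][c] * w for c, w in weights.items())
    denominator = sum(prices[BASE][c] * w for c, w in weights.items())
    return numerator / denominator * 100


def weighted_relatives_index(year):
    """Weighted arithmetic mean of the price relatives."""
    total = sum(prices[year][c] / prices[BASE][c] * 100 * w
                for c, w in weights.items())
    return total / sum(weights.values())


for year in (1959, 1960):
    print(year,
          round(weighted_aggregate_index(year), 1),
          round(weighted_relatives_index(year), 2))
```

The two methods do not agree exactly because the first weights the prices themselves, while the second weights the percentage changes.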
Illustration—6 


Construct Index Number for the year 1958 on the basis of 
the year 1950 of the following by Fisher's Ideal formula. 


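                 1950          1958
Article       P0    q0      P1    q1     P0q0   P0q1   P1q0   P1q1
I              5    10       4    12       50     60     40     48
II             8     6       7     7       48     56     42     49
III            6     3       5     4       18     24     15     20
Total                                     116    140     97    117

I. No. for 1958 by Fisher's ideal formula

\[ = \sqrt{\frac{\sum P_1 q_0}{\sum P_0 q_0} \times \frac{\sum P_1 q_1}{\sum P_0 q_1}} \times 100 \]

\[ = \sqrt{\frac{97}{116} \times \frac{117}{140}} \times 100 \]

\[ = \sqrt{0.8362 \times 0.8357} \times 100 \]

\[ = 0.836 \times 100 = 83.6 \text{ approx.} \]


Tests of Adequacy of Index Number Formula
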


Professor Irving Fisher, in his famous work “The Making
of Index Numbers”, has done commendable work in finding a
scientific formula for index numbers. Fisher has suggested two
tests which he believes should be met by an index number formula.
These tests are derived by analogy with the behaviour of
individual prices. If we take a single commodity, using period
0 as the base, its price movement between period 0 and period
1 is measured by the ratio $P_1 / P_0$. On the other hand, using
period 1 as base, its price movement between period 1 and
period 0 is measured by $P_0 / P_1$. Hence

\[ \frac{P_1}{P_0} \times \frac{P_0}{P_1} = 1 \]

By analogy, the index number $P_{01}$ measures price movements
between period 0 and period 1, and the index number $P_{10}$
measures price movements between period 1 and period 0.
Accordingly

\[ P_{01} \times P_{10} = 1 \]

This is called the ‘Time Reversal Test’. In $P_{10}$ the time is
reversed. This test is tantamount to saying that a formula
which shows a rise in prices of, say, 25 per cent between period 0
and period 1 should show a fall in prices of 20 per cent between
period 1 and period 0:

\[ 1.25 \times 0.80 = 1 \]

The second test is known as the ‘Factor Reversal Test’.
Fisher says, “Just as our formula should permit the
interchange of the two times without giving inconsistent results,
so it ought to permit interchanging the prices and quantities
without giving inconsistent results—i.e. the two results multiplied
together should give the true value ratio.” In simple words, the
change in price multiplied by the change in quantity should
be equal to the total change in value. If the price of a commodity
has doubled during a certain period with reference to the base
year, and in this period the quantity has increased four times,
the total value should be (2 × 4) = 8 times the former level.
If $P_1$ and $P_0$ represent the prices and $q_1$ and $q_0$ the
quantities of the current and the base years respectively, and if
$P_{01}$ represents the change in price in the current year and
$q_{01}$ the change in quantity in the current year, then

\[ P_{01} \times q_{01} = \frac{\sum P_1 q_1}{\sum P_0 q_0} \]

For a single commodity, $P_{01} = P_1 / P_0$ and $q_{01} = q_1 / q_0$,
so that

\[ \frac{P_1}{P_0} \times \frac{q_1}{q_0} = \frac{P_1 q_1}{P_0 q_0} \]

To prove it for Fisher's formula:

\[ P_{01} = \sqrt{\frac{\sum P_1 q_0}{\sum P_0 q_0} \times \frac{\sum P_1 q_1}{\sum P_0 q_1}} \]

and

\[ q_{01} = \sqrt{\frac{\sum P_0 q_1}{\sum P_0 q_0} \times \frac{\sum P_1 q_1}{\sum P_1 q_0}} \]

\[ P_{01} \times q_{01} = \sqrt{\frac{\sum P_1 q_0}{\sum P_0 q_0} \times \frac{\sum P_1 q_1}{\sum P_0 q_1} \times \frac{\sum P_0 q_1}{\sum P_0 q_0} \times \frac{\sum P_1 q_1}{\sum P_1 q_0}} \]

\[ = \sqrt{\frac{\sum P_1 q_1}{\sum P_0 q_0} \times \frac{\sum P_1 q_1}{\sum P_0 q_0}} = \frac{\sum P_1 q_1}{\sum P_0 q_0} \]

Fisher's Ideal formula of index numbers thus satisfies the
Factor Reversal Test also. However, as Boddington puts it,
“Unfortunately, while this formula apparently meets most of the
mathematical requirements of a perfect index-formula, it is
objected to, on the score that it is not clear what it measures i.e.
the result combines both price and volume changes, when usually
we want the one to be separated from the other."
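
Both tests can be checked numerically. The sketch below is only an illustration: the figures for three articles are illustrative, the function names are invented, and the index numbers are kept as ratios rather than percentages so that the products can be compared with 1 and with the value ratio directly. Laspeyre's formula is included for contrast.

```python
import math


def sum_pq(prices, quantities):
    """Sum of price x quantity over all articles."""
    return sum(p * q for p, q in zip(prices, quantities))


def laspeyres(p0, q0, p1, q1):
    return sum_pq(p1, q0) / sum_pq(p0, q0)


def fisher(p0, q0, p1, q1):
    return math.sqrt(sum_pq(p1, q0) / sum_pq(p0, q0)
                     * sum_pq(p1, q1) / sum_pq(p0, q1))


def time_reversal_product(index, p0, q0, p1, q1):
    """P01 x P10, where P10 is obtained by interchanging the two periods."""
    return index(p0, q0, p1, q1) * index(p1, q1, p0, q0)


def factor_reversal_gap(index, p0, q0, p1, q1):
    """P01 x q01 minus the value ratio; q01 interchanges prices and quantities."""
    value_ratio = sum_pq(p1, q1) / sum_pq(p0, q0)
    return index(p0, q0, p1, q1) * index(q0, p0, q1, p1) - value_ratio


# Illustrative figures for three articles.
p0, q0 = [5, 8, 6], [10, 6, 4]
p1, q1 = [4, 7, 5], [12, 7, 3]

for formula in (fisher, laspeyres):
    print(formula.__name__,
          time_reversal_product(formula, p0, q0, p1, q1),
          factor_reversal_gap(formula, p0, q0, p1, q1))
```

For Fisher's formula the product is 1 and the gap is zero up to rounding error; for Laspeyre's formula on these figures both are slightly off.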


Another test of adequacy of an index number formula is the
‘Circular Test’. This test is met by practically none of the index
number formulae, including Fisher's. It is an extension of the time
reversal test. Suppose an index number is computed for
period 1 on the base period 0, and another index number is
computed for period 0 on the base period 2; then it should be
possible to obtain from them an index number for period 1 on the
base of period 2. If the index number so obtained agrees with the
one calculated directly, the circular test is satisfied. If

$P_{01}$ = Index No. for the current year based on period 0
$P_{20}$ = Index No. for period 0 on the base of period 2
$P_{21}$ = Index No. for the current year based on period 2

then

\[ P_{21} = P_{20} \times P_{01} \]

or

\[ \frac{P_{20} \times P_{01}}{P_{21}} = 1 \quad \text{or} \quad P_{01} \times P_{12} \times P_{20} = 1 \]

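It is argued that if a price index between periods 0 and
1 has risen to ‘x’ and between periods 1 and 2 to ‘y’, then between
periods 0 and 2 it should have risen to ‘xy’. As an index number
has meaning only in terms of the system of weighting adopted,
this requirement is not reasonable.

Illustration—7

Show that the Factor Reversal and Time Reversal tests
are satisfied by Fisher's formula.

                 1958          1960
Article       P0    q0      P1    q1     P0q0   P0q1   P1q0   P1q1
I              5    10       4    12       50     60     40     48
II             8     6       7     7       48     56     42     49
III            6     4       5     3       24     18     20     15
Total                                     122    134    102    112

Time Reversal Test: $P_{01} \times P_{10} = 1$

\[ P_{01} = \sqrt{\frac{\sum P_1 q_0}{\sum P_0 q_0} \times \frac{\sum P_1 q_1}{\sum P_0 q_1}} = \sqrt{\frac{102}{122} \times \frac{112}{134}} \]

\[ P_{10} = \sqrt{\frac{\sum P_0 q_1}{\sum P_1 q_1} \times \frac{\sum P_0 q_0}{\sum P_1 q_0}} = \sqrt{\frac{134}{112} \times \frac{122}{102}} \]

\[ P_{01} \times P_{10} = \sqrt{\frac{102}{122} \times \frac{112}{134} \times \frac{134}{112} \times \frac{122}{102}} = \sqrt{1} = 1 \]

Hence T.R.T. is satisfied.

Factor Reversal Test: $P_{01} \times q_{01} = \dfrac{\sum P_1 q_1}{\sum P_0 q_0}$

\[ P_{01} = \sqrt{\frac{\sum P_1 q_0}{\sum P_0 q_0} \times \frac{\sum P_1 q_1}{\sum P_0 q_1}} = \sqrt{\frac{102}{122} \times \frac{112}{134}} \]

\[ q_{01} = \sqrt{\frac{\sum P_0 q_1}{\sum P_0 q_0} \times \frac{\sum P_1 q_1}{\sum P_1 q_0}} = \sqrt{\frac{134}{122} \times \frac{112}{102}} \]

\[ P_{01} \times q_{01} = \sqrt{\frac{102}{122} \times \frac{112}{134} \times \frac{134}{122} \times \frac{112}{102}} = \sqrt{\frac{112}{122} \times \frac{112}{122}} = \frac{112}{122} = \frac{\sum P_1 q_1}{\sum P_0 q_0} \]
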
Hence F.R.T. is also satisfied.

Illustration—8

[The statement and the item-wise table of this illustration are
not legible in the source. The column totals used in the
computations below are: $\sum P_0 q_0 = 1360$, $\sum P_0 q_1 = 1344$,
$\sum P_1 q_0 = 1900$, $\sum P_1 q_1 = 1880$,
$\sum P_0 (q_0 + q_1) = 2704$ and $\sum P_1 (q_0 + q_1) = 3780$.]

(A) Laspeyre's Method

\[ P_{01} = \frac{\sum P_1 q_0}{\sum P_0 q_0} \times 100 = \frac{1900}{1360} \times 100 = 139.7 \]

(B) Paasche's Method

\[ P_{01} = \frac{\sum P_1 q_1}{\sum P_0 q_1} \times 100 = \frac{1880}{1344} \times 100 = 139.9 \]

(C) Marshall-Edgeworth's Method

\[ P_{01} = \frac{\sum P_1 (q_0 + q_1)}{\sum P_0 (q_0 + q_1)} \times 100 = \frac{3780}{2704} \times 100 = 139.8 \]

(D) Fisher's Method

\[ P_{01} = \sqrt{\frac{\sum P_1 q_0}{\sum P_0 q_0} \times \frac{\sum P_1 q_1}{\sum P_0 q_1}} \times 100 = \sqrt{\frac{1900}{1360} \times \frac{1880}{1344}} \times 100 = 139.8 \]

(E) Drobisch and Bowley's Method

\[ P_{01} = \frac{1}{2} \left( \frac{\sum P_1 q_0}{\sum P_0 q_0} + \frac{\sum P_1 q_1}{\sum P_0 q_1} \right) \times 100 = \frac{1}{2} \left( \frac{1900}{1360} + \frac{1880}{1344} \right) \times 100 = 139.8 \text{ (140 approx.)} \]

(F) Walsh's Method

\[ P_{01} = \frac{\sum P_1 \sqrt{q_0 q_1}}{\sum P_0 \sqrt{q_0 q_1}} \times 100 \]

[The figures of this computation are not legible in the source.]


We have seen that a large number of index number formulae
are available. Applied to the same situation these formulae
generally give different results, although in practice weighted
index numbers do not differ greatly unless the dispersion of
individual price movements is substantial. However, there is no
single ‘best’ index number formula. Each formula has a precise
meaning; most of them measure the change in the aggregate cost
of a certain collection of goods. The important question is: does
a particular formula measure what we want to measure? Each
formula measures something, but is it the thing we want?

Consumer Price Index Number. Till recently, this was
known as the ‘Cost of Living Index Number’. The change in
nomenclature has been adopted not only in our country but in
almost all other countries. The original suggestion to change the
name was made by a special committee of the American Statistical
Association in 1944. The committee remarked, "the index was
not actually to measure changes in the cost of living, that the title
was a misnomer, that the index had been improperly defined."
The change has been made because the term ‘cost of living
index number’ is misleading. This type of index does not
measure the actual cost of living, nor the fluctuations in the
cost of living due to causes other than changes in the price level.
For reasons other than financial, there may be a change in the
dietary habits of a group of people, and this may result in a
change in the cost of living. For example, during the period of
rationing of cereals, people had to consume other grains
besides wheat and rice. This resulted in a change in the
composition of the diet of the people, and consequently there
might have been some change in the cost of living. Such changes,
however, are not sought to be measured by the cost of living index
number. Consequently a more appropriate name, Consumer Price
Index Number, has been given to what was so far called the Cost of
Living Index Number. The new name is closer to what the index
actually measures.

The Consumer Price Index Number seeks to measure, over
time, the change in the cost of maintaining an unchanged pattern
of living for a particular group of persons, that is, the change in
the cost of consuming fixed quantities and qualities of goods and
services. Obviously this index must relate to a particular class
of persons who have a similar pattern of living and to a
definite locality, because the pattern of living differs from
locality to locality and from one class to another.
The working class has a different pattern of living from the
middle class, and the middle class from the richer class. Similarly
the pattern of living of the working class in Calcutta will be
quite different from that of the same class in Bombay or Madras.

So a particular Consumer Price Index Number relates to a
particular class of people having similar consumption habits and
pattern, and to a definite region with more or less economic
homogeneity. Moreover, since the index number relates to
a particular pattern of expenditure, it has reference to the period
in which that pattern prevailed. This period is called the
base period.


A period of comparative economic stability should be selected
as the base period, so that the consumption pattern which is
reflected in the index number remains practically the same over
a fairly long period. This consumption pattern can be ascertained
through a family budget enquiry. It is necessary that the
family budget enquiry amongst the class of people to whom the
index series is applicable should be conducted during the base
period. The 6th International Conference of Labour Statisticians
held in Geneva in 1946 was of the opinion that the period of the
enquiry of family budgets and the base period should be
identical as far as practicable. Normally the base period should
be one complete year, so that the element of fluctuation due to
change of season may be eliminated.


The commodities which are to be included in the index will
have to be selected from the standard or average family budget
that will be obtained from the family budget enquiry. A family
budget is the detailed statement of the expenditure a family has
to incur for living as a family. Such budgets are obtained through
a sample survey, and an average budget is prepared which is the
standard budget for that class of people. The goods and services
that enter into the standard budget should, as far as practicable,
be included in the index.


The commodities may be divided into five broad groups,
which may be (1) Food, (2) Fuel and Lighting, (3) Clothing and
Footwear, (4) Housing and (5) Miscellaneous. Percentage
expenditures on the different items constitute the individual
weights allocated to the corresponding price relatives, and
percentage expenditures on the five groups constitute the
group weights.


The Consumer Price Index Number is calculated by using
either Laspeyre's formula, that is

\[ \text{I. No.} = \frac{\sum P_1 q_0}{\sum P_0 q_0} \times 100 \]

or by the Family Budget method, that is

\[ \text{I. No.} = \frac{\sum IV}{\sum V} \]


Illustration—9 


From the data given below calculate the cost of living index
number for the current year by the Aggregate Expenditure
Method and the Family Budget Method separately:

Article         Quantity consumed    Unit    Price in     Price in
                in base year                 base year    current year
Rice             5 mds.              md.       6             8
Millets          5 mds.              ,,        4             5
Wheat            1 md.               ,,        5            10
Gram             1 md.               ,,        3             6
Arhar            ½ md.               ,,        4             6
Other Pulses     2 mds.              ,,        3             4
Ghee             4 seers             sr.       1.25          2
Gur              2 mds.              md.       2.50          5
Salt             12½ seers           ,,        4             5
Oil              24 seers            ,,       20            25
Clothing         40 yards            yd.        .25           .5
Firewood         10 mds.             md.        .50           .8
Kerosene         1 Tin               Tin       4             6
Houserent        1 House             H.       12            15

(Aggregate Expenditure Method)

Article         Aggregate expenditure    Aggregate expenditure
                in base year             in current year
                P0q0                     P1q0
Rice                 30                       40
Millets              20                       25
Wheat                 5                       10
Gram                  3                        6
Arhar                 2                        3
Other Pulses          6                        8
Ghee                  5                        8
Gur                   5                       10
Salt                  1.25                     1.56
Oil                  12                       15
Clothing             10                       20
Firewood              5                        8
Kerosene              4                        6
Houserent            12                       15
Total               120.25                   175.56

I. No. for the current year

\[ = \frac{\sum P_1 q_0}{\sum P_0 q_0} \times 100 = \frac{175.56}{120.25} \times 100 = 146 \]

(Family Budget Method)

Article        Quantity     Unit   Price in   Price in   Price        Weights   Weighted
               consumed            base       current    relatives    (P0q0)    relatives
               in base             year       year       for current
               year                P0 (Rs.)   P1 (Rs.)   year  I      V         IV
Rice            5 mds.      md.      6          8         133.3        30        3999
Millets         5 mds.      ,,       4          5         125          20        2500
Wheat           1 md.       ,,       5         10         200           5        1000
Gram            1 md.       ,,       3          6         200           3         600
Arhar           ½ md.       ,,       4          6         150           2         300
Other Pulses    2 mds.      ,,       3          4         133.3         6         799.8
Ghee            4 srs.      seer     1.25       2         160           5         800
Gur             2 mds.      md.      2.50       5         200           5        1000
Salt            12½ srs.    ,,       4          5         125           1.25      156.25
Oil             24 srs.     ,,      20         25         125          12        1500
Clothing        40 yds.     yd.       .25        .5       200          10        2000
Firewood        10 mds.     md.       .50        .8       160           5         800
Kerosene        1 Tin       Tin      4          6         150           4         600
Houserent       1 House     H.      12         15         125          12        1500
Total                                                                 120.25    17555.05

Index number for the current year

\[ = \frac{\sum (\text{Price Relatives} \times \text{Weights})}{\sum \text{Weights}} = \frac{\sum IV}{\sum V} = \frac{17555.05}{120.25} = 146 \]

(The weights are obtained by calculating the expenditure on
each article in the base year.)

Illustration—10

Construct the cost of living index for April 1960 from the
following data:

Groups                Weights proportionate     Group Index No.
                      to total expenditure      for April 1960
Food                         47                      247
Fuel and lighting             7                      293
Clothing                      8                      289
House rent                   13                      100
Misc.                        14                      236

Groups                Index No.    Weights    Weighted Relatives
                          I           V              IV
Food                     247          47           11,609
Fuel and lighting        293           7            2,051
Clothing                 289           8            2,312
House rent               100          13            1,300
Misc.                    236          14            3,304
Total                                 89           20,576

\[ \text{The cost of living index} = \frac{\sum IV}{\sum V} = \frac{20,576}{89} = 231.2 \]
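
Illustrations 9 and 10 can be checked with the sketch below. It assumes that salt and oil are priced per maund of 40 seers (which reproduces the expenditures 1.25 and 12 in the working) and that the weight of house rent is 13 (which gives the total of 89). The function names are invented for this sketch.

```python
SEERS_PER_MAUND = 40  # assumption used to convert seers into maunds

# (article, base-year quantity in the unit of the price, base price, current price)
items = [
    ("Rice", 5, 6, 8), ("Millets", 5, 4, 5), ("Wheat", 1, 5, 10),
    ("Gram", 1, 3, 6), ("Arhar", 0.5, 4, 6), ("Other Pulses", 2, 3, 4),
    ("Ghee", 4, 1.25, 2), ("Gur", 2, 2.50, 5),
    ("Salt", 12.5 / SEERS_PER_MAUND, 4, 5),
    ("Oil", 24 / SEERS_PER_MAUND, 20, 25),
    ("Clothing", 40, 0.25, 0.5), ("Firewood", 10, 0.50, 0.8),
    ("Kerosene", 1, 4, 6), ("Houserent", 1, 12, 15),
]


def aggregate_expenditure_index(items):
    """Laspeyre's form: cost of the base-year basket at current prices."""
    base = sum(q * p0 for _, q, p0, _ in items)
    current = sum(q * p1 for _, q, _, p1 in items)
    return current / base * 100


def family_budget_index(items):
    """Weighted mean of price relatives I with base-year values V as weights."""
    weights = [q * p0 for _, q, p0, _ in items]
    relatives = [p1 / p0 * 100 for _, _, p0, p1 in items]
    return sum(i * v for i, v in zip(relatives, weights)) / sum(weights)


def group_index(groups):
    """Weighted mean of group index numbers; groups is a list of (I, V)."""
    return sum(i * v for i, v in groups) / sum(v for _, v in groups)


# Illustration 10: (group index, weight); house rent weight 13 is assumed.
groups_april_1960 = [(247, 47), (293, 7), (289, 8), (100, 13), (236, 14)]

print(round(aggregate_expenditure_index(items), 1))
print(round(family_budget_index(items), 1))
print(round(group_index(groups_april_1960), 1))
```

The two methods of Illustration 9 agree because, with base-year values as weights, each weighted relative reduces to the current cost of the base-year quantity.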


Miscellaneous Problems Regarding Construction of 
Index Numbers 


(i) Change from fixed base index numbers into chain base 
index numbers. 


Illustration —11 


From the fixed base index numbers given below prepare 
chain base index numbers. 


Year 1950 1951 1952 1953 1954 1955 
I. No. 267 275 280 290 320 280 


To construct chain base index numbers from index numbers
given on a fixed base, each year's index is expressed as a
percentage of the preceding year's index (the first year, 1950,
being taken as 100):

Year    Fixed Base     Conversion          Chain Base
        Index No.                          Index No.
1950      267          -                    100
1951      275          275/267 × 100        103.0 approx.
1952      280          280/275 × 100        101.8   ,,
1953      290          290/280 × 100        103.6   ,,
1954      320          320/290 × 100        110.3   ,,
1955      280          280/320 × 100         87.5   ,,


(ii) Change from chain base index numbers into fixed base 
index numbers. 


Illustration—12

From the chain base index numbers given below prepare
fixed base index numbers.

Year      1955    1956    1957    1958    1959
I. No.      90     105     102      95      98

Year    Chain Base    Conversion                                        Fixed Base
        Index No.                                                       Index No.
1955       90         -                                                   90
1956      105         90 × 105/100                                        94.5
1957      102         90 × 105/100 × 102/100                              96.4
1958       95         90 × 105/100 × 102/100 × 95/100                     91.6
1959       98         90 × 105/100 × 102/100 × 95/100 × 98/100            89.7
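
The two conversions of Illustrations 11 and 12 can be sketched as a pair of small functions. The function names are invented for this sketch; the first index of a chain series is taken as it stands, as in the working of Illustration 12.

```python
def fixed_to_chain(fixed):
    """Each fixed base index as a percentage of the preceding one."""
    chain = [100.0]  # the first year is taken as 100
    for previous, current in zip(fixed, fixed[1:]):
        chain.append(current / previous * 100)
    return chain


def chain_to_fixed(chain):
    """Multiply the links together, starting from the first index as given."""
    fixed = [float(chain[0])]
    for link in chain[1:]:
        fixed.append(fixed[-1] * link / 100)
    return fixed


chain_1950_55 = fixed_to_chain([267, 275, 280, 290, 320, 280])
fixed_1955_59 = chain_to_fixed([90, 105, 102, 95, 98])

print([round(x, 1) for x in chain_1950_55])
print([round(x, 1) for x in fixed_1955_59])
```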


(iii) Base-Shifting. Sometimes it is desired to shift the
base from one period to another. This is needed either because
the previous base has become too old and is useless for
comparison, or because comparison is to be made with another
series of index numbers having a different base. Two methods
are available for shifting the base. (i) The relatives
of each individual item may be reconstructed with the
new base, and thus an entirely new series may be formed.
This is a lengthy process. (ii) A shorter method is to
divide the index of each year by the index of the
year selected as the new base and to multiply the quotient by 100.
When the index is an arithmetic mean of relatives, this short
method introduces some error; it is exact when the relatives have
been averaged geometrically. When, however, actual prices are
averaged or totalled, base-shifting presents no problem: any
one year may be chosen as base and the average or total prices of
the other years expressed in relation to that year.


Illustration—13 


From the index numbers given below, find out index 
numbers by shifting base from 1952 to 1955. 


Year    Index No.    Index No. with base 1955
1952      100        100 × 100/50 = 200
1953       76         76 × 100/50 = 152
1954       68         68 × 100/50 = 136
1955       50         50 × 100/50 = 100
1956       60         60 × 100/50 = 120
1957       70         70 × 100/50 = 140
1958       75         75 × 100/50 = 150
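
The short method of base-shifting used in Illustration 13 amounts to one division per year, as the following sketch shows. The function name is invented for this sketch, and the year of the second row is read as 1953.

```python
def shift_base(indices, new_base_year):
    """Divide every index by the index of the new base year and multiply by 100."""
    base_value = indices[new_base_year]
    return {year: value / base_value * 100 for year, value in indices.items()}


# Index numbers on base 1952 (year 1953 is the assumed reading of the second row).
series = {1952: 100, 1953: 76, 1954: 68, 1955: 50, 1956: 60, 1957: 70, 1958: 75}
shifted = shift_base(series, new_base_year=1955)

for year in sorted(shifted):
    print(year, round(shifted[year], 1))
```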


(iv) Splicing. Sometimes the construction of an index
number series is discontinued because its base has become
too old and it has no practical utility. A new set of index numbers
may then be computed with some recent year as the base. At times
it may be desired to connect the new set of index numbers with
the discontinued one. This brings the problem of splicing.
It is done in this way: each new index number is multiplied by
the ratio of the old index to the new index in the overlapping
year (here 1950).

Year    Old I. Nos.    New I. Nos.    Splicing            Index No.
                                      Technique           Spliced to old
1914      100             -              -                    -
1915      105             -              -                    -
1916      108             -              -                    -
1950      300            100          100 × 300/100          300
1951       -             105          105 × 300/100          315
1952       -             108          108 × 300/100          324
1953       -             112          112 × 300/100          336
1954       -             120          120 × 300/100          360

(v) Deflating the Index Numbers. A price index number indicates
the change in money prices only. A series of money wages
or income can be corrected for price changes to find out the
level of real wages or income. This is done by dividing each money
figure by the price index number of the same year and multiplying by 100,
a process called deflating. A series thus deflated gives wages or income
in real terms.




Illustration—14 


The following table gives the annual income of a teacher and
the general index number of prices during 1950-58. Prepare the
index numbers to show the changes in the real income of the
teacher.


Year    Income       Price        Conversion         Real Income    Real Income
        in Rupees    Index No.                       (Rs.)          Index No.
1950    360          100          360/100 x 100      360            100
1951    420          104          420/104 x 100      404            112.2
1952    500          115          500/115 x 100      434.5          120.7
1953    550          160          550/160 x 100      344             95.6
1954    600          280          600/280 x 100      214.3           59.5
1955    640          290          640/290 x 100      220.7           61.3
1956    680          300          680/300 x 100      226.6           63.0
1957    720          320          720/320 x 100      225             62.5
1958    750          330          750/330 x 100      227.3           63.2
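
The deflating procedure of this illustration may be sketched in Python as below. The function name `real_income_index` and its arguments are assumptions made for the example. The code carries full precision, so a figure may differ in the first decimal place from the hand-rounded values of the table.

```python
def real_income_index(income, price_index, base_year):
    """Deflate money income by the price index, then express real income as an index."""
    real = {year: income[year] / price_index[year] * 100 for year in income}
    index = {year: real[year] / real[base_year] * 100 for year in real}
    return real, index


income = {1950: 360, 1951: 420, 1954: 600, 1957: 720}
prices = {1950: 100, 1951: 104, 1954: 280, 1957: 320}
real, index = real_income_index(income, prices, 1950)
for year in sorted(real):
    print(year, round(real[year], 1), round(index[year], 1))
```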


(vi) Index Number of unemployment. 


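Illustration 15
Compute the index number of unemployment for 1950 using
1947 figures as the base:

Year    Total Population    No. Unemployed
1947    34 x 10^7           5 x 10^7
1950    42 x 10^7           9 x 10^7


Using Fisher's formula, with the number unemployed taken as p and the total population as q:

$$\text{I. No. for 1950} = \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_0 q_1}} \times 100$$

$$= \sqrt{\frac{(9 \times 10^7)(34 \times 10^7)}{(5 \times 10^7)(34 \times 10^7)} \times \frac{(9 \times 10^7)(42 \times 10^7)}{(5 \times 10^7)(42 \times 10^7)}} \times 100$$

$$= \sqrt{\frac{306}{170} \times \frac{378}{210}} \times 100 = \sqrt{3.24} \times 100 = 1.8 \times 100 = 180$$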
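
Fisher's formula as applied above can be checked with a short Python sketch. The function name `fisher_index` is an assumption for this example; each argument is a list with one entry per item, so the unemployment problem is a one-item case.

```python
from math import sqrt


def fisher_index(p0, q0, p1, q1):
    """Fisher's ideal index: geometric mean of the base-weighted and current-weighted ratios."""
    base_weighted = sum(a * b for a, b in zip(p1, q0)) / sum(a * b for a, b in zip(p0, q0))
    current_weighted = sum(a * b for a, b in zip(p1, q1)) / sum(a * b for a, b in zip(p0, q1))
    return sqrt(base_weighted * current_weighted) * 100


# Illustration 15: unemployed as p, total population as q
index_1950 = fisher_index([5e7], [34e7], [9e7], [42e7])
print(round(index_1950, 1))
```

The same function accepts several commodities at once, for instance three price and quantity pairs for a price index.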
(vii) Quantity Index Numbers. An aggregate index number
of quantity (physical volume) is the counterpart of the corresponding
price index. Thus the construction of a simple aggregative
quantity index would involve the formula

$$Q = \frac{\sum q_1}{\sum q_0} \times 100$$

Using base year prices as weights, the formula becomes

$$Q = \frac{\sum q_1 p_0}{\sum q_0 p_0} \times 100$$

According to Fisher's formula

$$Q = \sqrt{\frac{\sum q_1 p_0}{\sum q_0 p_0} \times \frac{\sum q_1 p_1}{\sum q_0 p_1}} \times 100$$

(viii) The Index of Industrial Production. The index of
industrial production seeks to measure the overall change in the
total volume of industrial production. During a given period
there may be an increase in the output of some industries
while there may be shrinkage in others. It would be very
useful to measure the overall change in the volume of industrial
production. This is possible by the construction of index numbers
of industrial production. Such indices are generally limited to
production taking place in secondary industries. Their
compilation enables us to compare the rates of change of
production in the various industries of an economy, and to
compare these rates with changes in employment etc.


The change can be studied by adopting the following methods:

(i)
$$\text{I. No.} = \frac{\sum R_m W_m}{\sum W_m} \times 100$$

where R_m is the production relative, i.e. the ratio of
production during the compared month to that during the
base period, and W_m is the weight allotted to the item.

(ii)
$$\text{I. No.} = \frac{\sum P_1 Q_1 - \sum p_1 q_1}{\sum P_0 Q_0 - \sum p_0 q_0} \times 100$$

P = price of the output of an industry
Q = quantity of the output of an industry
p = price of the input of an industry
q = quantity of the input of an industry
1 = for current year
0 = for base year

Such an index number will give the change in the value of
production.


(ix) Labour Productivity Index. Indices of industrial
production, whether of production as a whole or of particular
industries, are often used in conjunction with employment series
to measure labour productivity.

The labour productivity of the base year will be

$$\frac{\sum\left(\sum P_0 Q_0 - \sum p_0 q_0\right)}{L_0} \times 100$$

where L_0 stands for the employment figure of the base year.
The labour productivity of the current year will be

$$\frac{\sum\left(\sum P_1 Q_1 - \sum p_1 q_1\right)}{L_1} \times 100$$

where L_1 stands for the employment figure of the current year.
An index of productivity can also be obtained by the following
method:

$$\text{Index of Productivity} = \frac{\text{Index of Industrial Production}}{\text{Index of Labour Employment}} \times 100$$


Limitations and Uses of Index Numbers. It will be
apparent that an index number, whether it be of prices, of
physical quantities or of any other measure, is an arbitrary and
imperfect measure. At best it will perform the task for which
it is designed, provided every care has been taken to include the
relevant constituent items and to weight them correctly. Index
numbers exhibit the relative rather than the absolute change.
Different index numbers serve different purposes. Index numbers
that are good for one purpose may be unsuitable for another.
The form of index number to use must depend upon circumstances,
e.g. the type of problem, the extent and reliability of data etc. The
art of applying an index number, even for the specific purpose
for which it was calculated, is acquired only after careful study
and considerable experience. Another limitation of index
numbers arises from the wide range of variability of the data.
For example, an index of consumers' prices extends over such a
variety of commodities, having different uses and subject to
different influences, that an average may not be significant.

Index numbers reflect the movement of some phenomenon
over time or place. Hence they are a good measure for comparison.
Different types of index numbers have different uses. The wholesale
price index number tells us the changes taking place in the value
of money. The cost of living index number indicates the
changes in the real income of the people. It can help in adjusting
the wage-level so that the real wage-level may not decrease.
The investment index number is of great use to stock-exchange
speculators. There is now developing a tendency towards the restatement
of Economics in statistical form, which promises to be an
important contribution to the social sciences. For such study index
numbers are of great use as tools for the analysis of economic
fluctuations.


Theoretical Questions


1—'Index Numbers are devices for measuring differences in 
the magnitude of a group of related variables.’ Elucidate. Also 
discuss the important uses of Index Numbers. 


(M. Com. Raj. 1956) 

2—Define an ‘Index Number’. Distinguish between the 

Fixed Base and Chain Base Methods of constructing index numbers 
and discuss their relative merits. 

(B. Com. B. H. U. 1957) 


3—'Index Numbers are economic barometers’. Explain 
this statement and mention what precautions should be taken in 

making use of published index numbers. 
(B. Com. Alld. 1952) 


4—Define an index number. Explain the role of weights 
in the construction of an index of a general price level. 
(M. А. Raj. 1950) 


5—What points would you take into consideration in choosing 
the base and determining the weights in the preparation of cost 
of living index numbers. (B. Com. Agra 1948) 

6—Discuss the Ideal Formula for preparing index numbers 
given by Fisher. 


7—You are required to construct a cost of living index for 
textile workers of a city. Indicate what information would you 
collect for the purpose and explain the method of constructing 
the index. (M. Com. Vikram 1960) 


8. Examine the concept of 'Economic Activity Index'.
Describe in general terms the method of constructing such an index.
(M. Com. Luck.)


9. Give a brief description of various types of Index
Numbers. Explain in detail the method of constructing any two
of them. (B. Com. Madras)


10. Expound the various methods of ‘weighting’ and discuss 
their suitability for calculating a weighted index number of Prices 
in India. (M.A. Calcutta) 


11— What is meant by the reversibility of an index number ? 
What index numbers are reversible ? (B. Com. Luck.) 


12. "The real problem for the maker of index numbers is
whether he shall leave weighting to chance or seek to rationalise
it." (Mitchell) Distinguish clearly between chance weighting and
rational weighting and suggest a solution of the above problem.
Also discuss whether Fisher's ideal formula offers a rational
system of weighting. (M. Com. Alld.)


13. "The discussion of the proper weights to be used has
occupied a space in statistical literature out of all proportion
to its significance. For it may be said that no great importance
need be attached to the special choice of weights; one of the
most convenient facts of statistical theory is that given certain
conditions the same result is obtained with sufficient closeness whatever
logical system of weights is applied." (Bowley) Discuss the
above statement. (M. Com. Alld.)

14. Explain the use of Index Numbers for the study of price
changes in statistical analysis. (I.C.S.)


15. "Index Numbers are a series of numbers by which
changes in the magnitude of a phenomenon are measured from time
to time or from place to place." (Secrist) Explain.


16. What is a cost of living index number? What
purpose does it serve? How will you construct a cost of living
index number for the working class population of your city?


Practical Questions 


1. Prepare index numbers of prices for three years with the 
average price as base :— 


Rate per rupee 


wheat cotton oil 
Ist year 10 seers 4 seers 3 seers 
2nd year of 82, 8.5 
9rd year ӨӨ gius 


25 22 
(B. Com. Agra) 


(Ans. I. No. for Ist year 90.97, 2nd year 98.1, 3rd year 109.9) 
2— Prepare index number of prices for three years with the 
average prices as base :— 


Rate per rupee (in seers) 


Wheat Cotton Oil 
Ist year 4 2 2 
2nd year 8 115 144 
8rd year 215 1 3 


А 
(В. Com. Saugar) 


(Ans. I. No. for 1st year 67.5, 2nd year 95.0 and 3rd year 137.5)

3. From the information given below, prepare cost of living
index numbers for 1958 and 1959 taking the average prices of
1957 as base:


Group of Articles 1957 1958 1959 
Rs. Rs. Rs. 
1. Food per md. 20 24 21 
2. Clothing per yd. и 1=50 1 
3. Rent per room 5 8 8 


4. Miscellaneous 2 2—95 2—19 
Give weights to the four groups as 4, 3, 2, 1 respectively. 
(B. Com. Agra) 
(Ans. I. No. for 1958—124 and for 1959 —109.8 by Agg. Method) 


(Vikram M. Com.)
4. Compute the index number of unemployment for 1960
using 1957 as the base:


Year    Total Population    No. Unemployed
1957    34 x 10^7           5 x 10^7
1960    42 x 10^7           9 x 10^7


(Unemployment Index for 1960 = 180)


5. From the following data, calculate a price index for the 
year 1958 by using simple geometric mean :— 


1950 ( Base year) (1958) 
Commodity Average Price Average Price 

A 16.1 14.2
B 9.2 8.7
C 15.1 12.5
D 5.6 4.8
E 117 18.4
F 100.0 117.0


Now reverse the process taking 1958 as base year and 1950 
as current year and show that two results are strictly consistent. 


(B. Com. B.H.U.) 
(I. No for 1958—104.0 and I. No for 1950=96.2) 
6. The following are the group index numbers and the group
weights of an average working class family's budget. Construct
the cost of living Index Number by assigning the given weights:


Group              Index No. for 1960    Weights
Food               352                   48
Fuel & lighting    220                   10
Clothing           230                    8
House Rent         160                   12
Miscellaneous      190                   15


(Index Number = 276.4)
(B. Com. B.H.U., Vikram & I.A.S.)


7. An average family of industrial workers in a town
consumed during August 1949 1.5 maunds of food-grains, 10 yards
of cloth, 2 maunds of fuel and 1 tin of kerosene oil, and paid
Rs. 15 as house rent. Food-grains then sold at an average price
of Rs. 6 per maund, cloth at 50 N.P. per yard, fuel at
Rs. 2.25 per maund and kerosene oil at Rs. 5 per tin.
By August 1959 the average prices of food-grains and cloth had
risen to 3 times and 2½ times the 1949 level respectively, fuel had risen
to Rs. 5 per maund and house rent to Rs. 20. The solitary
exception was kerosene oil, whose price fell by 50 N.P. per tin.

Express in quantitative terms the rise that took place in the
cost of living of industrial workers in August 1959 as compared
with August 1949, making clear your method of approach.

(M. Com. Agra)
(Cost of Living Index No. for August 1959 = 192.2)


8. Explain Fisher's Ideal Formula for preparing Index
Numbers. What are the Time Reversal and Factor Reversal Tests?
Prepare an Index Number for 1960 on the basis of 1950, where the
following information is given:


        Article I           Article II          Article III
Year    Price   Quantity    Price   Quantity    Price   Quantity
1950    5       10          8       6           6       3
1960    4       12          7       7           5       4


(M. Com. Agra)


(Fisher's Ideal Index No. for 1960 = 83.6)
9. Construct with the help of data given below, Fisher's 
Ideal Index and show how it satisfies the factor reversal test :— 


          Produce in thousand tons    Harvest price per md. (Rs.)
          in district Saran           in district Saran
          1959-60     1960-61         1959-60     1960-61
Rice      17          26              3.50        3.12
Barley    107         83              2.00        1.87
Maize     62          48              2.56        1.75
(M. A. Patna)


(Fisher's Ideal Index for 1960-61 = 84)
10. Given the following data what index numbers would you 
use for purpose of comparison. Give reasons. 


        RICE                WHEAT               JOWAR
Year    Price   Quantity    Price   Quantity    Price   Quantity
1957    9.3     100         6.4     11          5.1     5
1960    4.5     90          3.7     10          2.7     3


(M. A. Calcutta)
(Fisher's Ideal Index for 1960 = 49.1)


11. Compute an appropriate index number for purposes of 
comparison from the following data :— 


Rice Wheat Jowar
Year | ——————————— 
Price | Quantity | Price | Quantity | Price | Quantity 
1940 | 4. | № 10 2 | 5 
1959 10 | 40 8 8 4 4 
(I.A.S.)


(Fisher's Ideal Index for 1959 = 250)
12. Construct the cost of Living Index Number for 1960 on
the basis of 1955 from the following data using (i) Aggregate
Expenditure Method and (ii) Family Budget Method:


Articles Qty.Consumed Unit Prices in Price in 
in 1955 1955 1960 
Rs. Rs. 
Wheat 6 Maunds Md. 10—00 16—00 
Rice PIT 3 15=00 20—00 
Gram TN » 6—00 12—00 
Arhar 8^ 35 5 8=00 12—00 
Ghee 6 Seers Sr. 3—00 5—00 
Gur 2 Maunds Md. 5—00 10— 
Salt 16 Seers » 6—00 9—00 
Oil 5065 Sr. 1=25 2—50 
Clothing 50 Yards Yd. 0—50 0—62 
Firewood 8 Maunds Md. 0—75 1—25 
Kerosene 1 Tin Tin 3—50 7—00 
House Rent Опе House Н. 10—00 15—00 


(I. No. by Ag. Expdr. Method and Family Budget Method 
—154.06) 
13. Use the following data of industrial production in India
to compare the annual fluctuations in Indian industrial activity by
the chain base method:
Index Numbers of Industrial Production in India


Year       Index No.    Year       Index No.
1939-40    120          1946-47    149
1940-41    122          1947-48    156
1941-42    116          1948-49    137
1942-43    120          1949-50    162
1943-44    120          1950-51    149
1944-45    137          1951-52    160
1945-46    136          1952-53    160


(Luck. M. Com.)
Chain Indices:
(Ans. 100, 101.66, 95.08, 103.4, 100, 114.16, 99.27, 109.6,
104.7, 87.8, 118.2, 91.9, 107.4, 100)
14. An enquiry into the budgets of the middle class families
in a city in England gave the following information:


                 Food     Rent     Clothing    Fuel     Misc.
Expenses on      (35%)    (15%)    (20%)       (10%)    (20%)
                 £        £        £           £        £
Prices (1959)    150      30       75          25       40
Prices (1960)    145      30       65          23       45


What change in the cost of living figure of 1960 as compared
with that of 1959 is seen? (Luck. B. Com.)
(Ans. 98.15)
15. Construct appropriate index numbers to discuss the
fluctuation in the export of raw cotton from India for the period
1930-31 to 1935-36, using the average of the period 1926-30
as base.


Year              Quantity of raw cotton    Value of raw cotton
                  in thousand tons          in lakh rupees
1926-30 (Base)    609                       5941
1930-31           701                       4633
1931-32           423                       2345
1932-33           365                       2037
1933-34           504                       2753
1934-35           623                       3495
1935-36           607                       3377
(I.C.S.)


Answer. Qty: 115.1, 69.4, 60, 83, 102, 98
Value: 78.1, 40, 34, 46, 59, 57


16. Show with the help of the following data that the factor
reversal test is satisfied by Fisher's Ideal formula for Index
Number construction:


Commodity    Base Year            Current Year
             Price    Quantity    Price    Quantity
A            6        50          10       56
B            2        100         2        120
C            4        60          6        60
D            10       30          12       24
E            8        40          12       36
(Alld. M. Com., Punjab M. A., Delhi B. Com.)


Ans. (P01 x Q01) = 1880/1360 and Index No. = 140

17. A certain fictitious index number of wholesale prices, based
on the simple arithmetic mean of price relatives, comprises 40 items.
These are divided into 7 groups. A separate index is published
for each group. Find the index number for all the groups
combined for 1960 from the following data:


Group                      A      B     C      D      E     F      G
No. of items in group      10     5     8      4      8     4      6
Group Index for 1960       120    95    115    142    86    100    105


Also construct the combined index number with the help of the
geometric mean.


Ans. I. No. by Arith. Average 111.8
I. No. by G. M. 110.7
18. From the fixed base index numbers given below, prepare
chain base index numbers:


1955 1956 1957 1958 1959 1960
94 98 102 95 98 100
(B. Com. Agra)
(Chain Indices: 94, 97.8, 101.7, 94.6, 97.4, 99.3)


19. From the chain index (Base 1952) given below find out 
chain index by shifting the base to 1955. 


Year 1952 1953 1954 1955 1956 1957 
Chain Index 100 76 68 50 48 45 


(Chain Base indices with 1955 base—200, 152, 136, 100, 96, 90) 


20. Construct the cost of living index for April 1960 from 
the following data— 


Groups Weights Group Index No for- 
April 1960 

Food 47 247 

Fuel and lighting у 298 

Clothing 8 289 

House Rent 13 100 

Misc. 14 


236 
(B. Com. Alld.) 
(Cost of Living Index No=231) 
21. The annual wages of a worker in rupees along with price
index numbers are given below. Prepare index numbers for the real
wages of the worker.


Year    Wages    Price Index No.
1949    200      100
1952    240      160
1953    350      280
1954    360      290
1955    360      300
1956    370      320
1957    375      330
(B. Com. Agra)


(Real Wage I. No.: 100, 75, 62.5, 62, 60, 58, 56.8)
22. The following gives the annual income of a teacher and
the general index number of prices during the last nine years.


Year    Income (Rs.)    General Index No. of prices
1952    360             100
1953    420             104
1954    500             115
1955    550             160
1956    600             280
1957    640             290
1958    680             300
1959    720             320
1960    750             330

Prepare an index number to show the changes in the real income
of the teacher and discuss the effects of a rise in the general level
of prices on his real income.


(M. A. Agra)
(Index No. of Real Income: 100.0, 112.2, 120.7, 95.6, 59.5, 61.3,
63, 62.5, 63.2)


23. Calculate the index number of prices for 1960 on the
basis of 1959 from the data given below:


Commodities    Weights    Price per unit    Price per unit
                          1959 (Rs.)        1960 (Rs.)
A              40         16.00             20.00
B              25         40.00             60.00
C               5          0.50              0.50
D              20          5.12              6.25
E              10          2.00              1.50


(M.S.W. Lucknow)
(Index No. for 1960 = 124.4)


24. Prepare index numbers from the average prices of the
three groups of articles given below in rupees per unit.


Group    1958    1959    1960    1961
I        15      18      24      30
II        9      12      15      18
III       2       2       3       3


Give weights to the three groups as 4, 3 and 2 respectively.
(B. Com. Lucknow)


(Weighted Index: 100, 120, 160, 189)


25. Construct the cost of living Index from the following
data:


Items of consumption    Qty consumed    Price 1950    Price 1956    Unit
Food                    30 srs.         13.12         16.75         per md.
Clothing                15 yds.          1.00          1.25         per yd.
Education               1 child         45.00         60.00
Fuel etc.               15 srs.          3.00          4.25         per md.
Misc.                   10 units         0.75          0.62         per unit
(M. Sc. Agra)


(Ans. I. No. = 125.6)
26. Given below are two sets of indices, one with 1939 as
base and the other with 1947 as base:


(A)                     (B)
Year    Index No.       Year    Index No.
1939                    1947    100
1940                    1948    110
1941                    1949     90
1942                    1950     98
1943                    1951    101
1944                    1952    110
1945                    1953     98
1946                    1954     96
1947    400


You are to prepare a combined series with 1939 as base.


(Ans. 1947 = 400
1948 = 440
1949 = 360
1950 = 392
1951 = 404
1952 = 440
1953 = 392
1954 = 384)


: 27—'ТҺе following are the group index numbers and the 
group weights for the Ahmedabad workers for the month of J une 
1952. Construct the cost of living index number for the given 


month. 


Groups 
Food 


Fuel and Lighting 


Clothing 
Rent 


Miscellaneous 


Ans. 269.55) a 
d. the data given below, construct the cost of Living 


Index Number. 


Food 
Rent 
Clothing 


Fuel and Lighting 


Miscellaneous 


(Ans. 258.5) 


Index No. 
277 
283 
322 
107 
335 


Price Relatives 
250 
150 
320 
190 
300 


(M. Com. Vikram) 


Weights 
58 


4 
(Gujrat B. Com.) 


Weights 
45 
15 
20 
5 
15 
(Gujrat B. Com.)
29. From the following index numbers prepare new ones by
(a) taking the prices of the year 1942 as the base and (b) using
the chain base system.


Year            1939    1940    1941    1942    1943    1944
Index Number    100     110     175     250     300     400

(M. Com. Agra)
(Ans. I. No. with 1942 as base: 40, 44, 70, 100, 120, 160
Chain base I. No.: 100, 110, 159, 143, 120, 133)


30. Construct Index Numbers for the year 1904 on the base
of the year 1902 from the following:


        Article I        Article II       Article III
Year    Price   Qty.     Price   Qty.     Price   Qty.
1902    5       10       8       6        6       3
1904    4       12       7       7        5       4


(M. Com. Agra)
(Ans. I. No. 83 approx.)


CHAPTER 15


INTERPOLATION 


Meaning. Interpolation is a statistical method of estimating
the most likely figure of a dependent variable for a given value of the
independent variable. If two variables x and y are given simultaneously and
we are required to estimate the probable value of y for a given
x, the value is found by using the technique of interpolation.
For example, if the population of a city is given for
1901, 1911, 1921, 1931, 1941 and 1951 and it is required to
estimate the population of, say, 1925 or 1935, the process is called
interpolation. However, if the population for 1961 is to be estimated,
it is called extrapolation, because 1961 is outside the maximum
given value of x. Interpolation supplies us with the missing
link, while extrapolation, also called projection, helps us in
forecasting. The technique of interpolation is of very great
value in Statistical Methods. The Median and Mode of
grouped data are computed by a simple process of interpolation.
The technique also enables us to make the best estimate of a
missing figure in statistical data.


Necessity. It is not always possible, owing to financial
or other causes, to collect figures for each and every observation,
but nevertheless some intermediate figures
may be required. In such cases the only way is to resort to the
technique of interpolation. Some kinds of data are collected at
regular intervals; for example, the census of population in India takes
place every tenth year. If it is required to estimate the population
of the country for a year which is not a census year,
interpolation is the only way out. Interpolation may also be
made in case data are insufficient or have been destroyed or lost.
Again, it is sometimes found that the data pertaining to a
particular phenomenon are grouped by different agencies in
different types of groups, and this makes them unfit for
comparison. For example, one municipal corporation may group
its population in age groups of fifteen years while
another may have them of ten years. In such a case a comparison
of, say, death rates cannot be made directly. It is made possible
with the help of interpolation.
Assumptions. The statistical technique of interpolation,
and so also of extrapolation, is based on two assumptions:


(a) There is no violent or disturbing situation in the
intervening period. In other words, there is general orderliness
in the data. They depict some sort of continuity and are not
marked by sudden jumps from one period to the other; and


(b) There is uniformity in the changes of known figures. It
means that there is regularity in fluctuations and the rise and
fall is uniform.


Accuracy. When a certain figure has been interpolated, the
problem is how far it is accurate. In the words of
Dr. A. L. Bowley, the accuracy of interpolation depends (i) "on
knowledge of the possible fluctuations of the figures to be
obtained by a general inspection of the fluctuations of dates for
which they are given. (ii) On knowledge of the course of events
with which the figures are connected." It is to be noted that
interpolation is based on hypothetical conditions, which might be
different from the actual circumstances. In some cases the
figures do not exist at all. It is better if interpolation is
made within certain limits and the method used is mentioned.


Methods of Interpolation. The methods of interpolation
are:
1. Graphic Method.
2. Algebraic Methods.
The algebraic methods may be:
(1) Binomial Expansion Method.
(2) Parabolic Curve Method.
(3) Newton's Method of Finite or Advancing Differences.
(4) Newton-Gauss (Forward) formula.
(5) Stirling's Formula.
(6) Newton-Gauss (Backward) formula.
(7) Lagrange's method.

Graphic Method of Interpolation. According to Neiswanger,
"when two known points are joined by a line for the purpose of
estimating intermediate values, the process is called
interpolation." This is the simplest method of interpolation,
but its results are not very reliable. According to this
method the available data are plotted on graph paper, taking the x
variable on the x-axis and the y variable on the y-axis. The various
points are joined so as to form a curve. From the point on the
x-axis for which the value of y is to be interpolated, a line parallel
to the y-axis is drawn. From the point where this line cuts
the curve, a line parallel to the x-axis is drawn. The point
where this second line cuts the y-axis gives the required value of y.


Illustration—1 


Estimate the expectation of life at ages 22 and 40 from the
following data.


Age in years    Expectation of life in years
10              35.4
15              32.2
20              29.1
25              26.0
30              23.1
35              20.4

Algebraic Methods of Interpolation


1. Binomial Expansion Method. This method is applicable
when the independent variable x advances by equal increments
and the value to be interpolated is one of the class limits of the x
series. The formula is based on the expansion of (y - 1)^x = 0,
in which the powers of y are read as suffixes denoting the
successive terms of the series. The formula is expanded as

$$y_x - x\,y_{x-1} + \frac{x(x-1)}{1 \times 2}\,y_{x-2} - \frac{x(x-1)(x-2)}{1 \times 2 \times 3}\,y_{x-3} + \frac{x(x-1)(x-2)(x-3)}{1 \times 2 \times 3 \times 4}\,y_{x-4} - \frac{x(x-1)(x-2)(x-3)(x-4)}{1 \times 2 \times 3 \times 4 \times 5}\,y_{x-5} + \dots = 0$$


Illustration 2


Using any interpolation method other than graphical, find the
likely number for 1953 from the following table:


Year    Index No.
X       Y
1951    100      y0
1952    107      y1
1953    ?        y2
1954    157      y3
1955    212      y4
(M. Com., B.H.U.)

As four values are known, the expansion of (y - 1)^4 = 0 is used:

$$y_4 - \frac{4}{1}\,y_3 + \frac{4(4-1)}{1 \times 2}\,y_2 - \frac{4(4-1)(4-2)}{1 \times 2 \times 3}\,y_1 + \frac{4(4-1)(4-2)(4-3)}{1 \times 2 \times 3 \times 4}\,y_0 = 0$$

or
y4 - 4y3 + 6y2 - 4y1 + y0 = 0
By substituting values we get
212 - (4 x 157) + 6y2 - (4 x 107) + 100 = 0
212 - 628 + 6y2 - 428 + 100 = 0
6y2 = -212 + 628 + 428 - 100
6y2 = 744 or y2 = 124

There is another method of finding the coefficients of the binomial
expansion. The first term has the coefficient 1, which is not
written, and the suffix of y in it equals the power to which the
expansion is carried. Thus if we have to expand the equation to
the 5th power, it begins with y5. Each subsequent coefficient is
arrived at by multiplying the coefficient of the preceding term by
the suffix of y in that term and dividing the product by the rank
of that term. The second place will be occupied by 5y4, the
coefficient being (1 x 5)/1 = 5. The third place will be occupied by
10y3, the coefficient being (5 x 4)/2 = 10. The fourth place will be
occupied by 10y2, the coefficient being (10 x 3)/3 = 10. The next
coefficient will be (10 x 2)/4 = 5 and the next (5 x 1)/5 = 1. Thus we
go on finding the coefficients till the suffix of y becomes 0. The
+ and - signs appear alternately, beginning with +, and the suffix
of y diminishes by 1 as we proceed.
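
The binomial expansion method with one missing value can be sketched in Python as follows. The names `binomial_coefficients` and `interpolate_missing` are hypothetical helpers written for this example; `math.comb` (Python 3.8 or later) supplies the same coefficients that the rule above produces step by step.

```python
from math import comb


def binomial_coefficients(n):
    """Signed coefficients of (y - 1)^n, in the order y_n, y_(n-1), ..., y_0."""
    return [(-1) ** k * comb(n, k) for k in range(n + 1)]


def interpolate_missing(values):
    """values holds y_0 ... y_n with exactly one None; return the missing value."""
    n = len(values) - 1
    coeffs = binomial_coefficients(n)  # coeffs[k] multiplies y_(n-k)
    missing = values.index(None)
    known_sum = sum(coeffs[n - i] * v for i, v in enumerate(values) if v is not None)
    return -known_sum / coeffs[n - missing]


print(binomial_coefficients(5))  # [1, -5, 10, -10, 5, -1]
y2 = interpolate_missing([100, 107, None, 157, 212])
print(y2)
```

With the figures of Illustration 2 the sketch reproduces the value obtained by hand above.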
Illustration 3

The age of mothers and the average number of children
born per mother are given in the table below. Interpolate the
average number of children born per mother aged 30-34.


Age of mother in years    No. of children
X                         Y
15-19                     0.7      y0
20-24                     2.1      y1
25-29                     3.5      y2
30-34                     ?        y3
35-39                     5.7      y4
40-44                     5.8      y5
(M. Com., All.)

(y - 1)^5 = 0


y5 - 5y4 + 10y3 - 10y2 + 5y1 - y0 = 0
By substituting values we get
5.8 - (5 x 5.7) + 10y3 - (10 x 3.5) + (5 x 2.1) - 0.7 = 0
5.8 - 28.5 + 10y3 - 35 + 10.5 - 0.7 = 0
10y3 = -5.8 + 28.5 + 35 - 10.5 + 0.7
10y3 = 47.9 or y3 = 4.79


Thus the most probable number of children for the 30-34 age group is
4.79.

Illustration 4


The following table gives the population of a town at the
time of the last six censuses.


1901    75,401
1911    82,984
1921    86,686
1931    44,947
1941    93,091
1951    1,27,327


Estimate the population for 1961.


Note: In the above data the population increases continuously
from 1901 to 1921. But there is a sudden decline in
1931 and again a sudden rise in 1941. Before estimating the
population for 1961, it is necessary to find out the population for 1931
on the assumption of a uniform rate of increase.


Interpolation for 1931


Year    Population
X       Y
1901    75,401      y0
1911    82,984      y1
1921    86,686      y2
1931    ?           y3
1941    93,091      y4
1951    1,27,327    y5

(y - 1)^5 = 0


y5 - 5y4 + 10y3 - 10y2 + 5y1 - y0 = 0
By substituting values we get
1,27,327 - (5 x 93,091) + 10y3 - (10 x 86,686) + (5 x 82,984) - 75,401 = 0
1,27,327 - 4,65,455 + 10y3 - 8,66,860 + 4,14,920 - 75,401 = 0
or 10y3 = -1,27,327 + 4,65,455 + 8,66,860 - 4,14,920 + 75,401
or 10y3 = 8,65,469
or y3 = 86,546.9
or y3 = 86,547
By keeping this figure for 1931, we extrapolate for 1961.

Year    Population
1901    75,401      y0
1911    82,984      y1
1921    86,686      y2
1931    86,547      y3
1941    93,091      y4
1951    1,27,327    y5
1961    ?           y6

(y - 1)^6 = 0


y6 - 6y5 + 15y4 - 20y3 + 15y2 - 6y1 + y0 = 0

By substituting the values
y6 - (6 x 1,27,327) + (15 x 93,091) - (20 x 86,547) + (15 x 86,686) - (6 x 82,984) + 75,401 = 0
y6 - 7,63,962 + 13,96,365 - 17,30,940 + 13,00,290 - 4,97,904 + 75,401 = 0
y6 = 7,63,962 - 13,96,365 + 17,30,940 - 13,00,290 + 4,97,904 - 75,401
y6 = 2,20,750

Estimated population for 1961 = 2,20,750

This shows a sudden jump. It would be better if the figure for
1951 were also interpolated to bring uniform changes into the series.


Illustration 5


Interpolate the missing figures in the following table of rice
cultivation:

Year    Acres in millions
1951    76.6    y0
1952    78.7    y1
1953    ?       y2
1954    77.7    y3
1955    78.7    y4
1956    ?       y5
1957    80.6    y6
1958    77.6    y7
1959    78.7    y8


(y - 1)^8 = 0 and (y - 1)^7 = 0
(Note: As two values are to be interpolated, the last item
is left out of one equation, so that the values of both may be
calculated.)


y8 - 8y7 + 28y6 - 56y5 + 70y4 - 56y3 + 28y2 - 8y1 + y0 = 0
y7 - 7y6 + 21y5 - 35y4 + 35y3 - 21y2 + 7y1 - y0 = 0

By substituting values we get

78.7 - (8 x 77.6) + (28 x 80.6) - 56y5 + (70 x 78.7) - (56 x 77.7) + 28y2 - (8 x 78.7) + 76.6 = 0

77.6 - (7 x 80.6) + 21y5 - (35 x 78.7) + (35 x 77.7) - 21y2 + (7 x 78.7) - 76.6 = 0


Taking the first:
-56y5 + 28y2 + 7921.1 - 5601.6 = 0
or -56y5 + 28y2 = -2319.5    (i)

Taking the second:
21y5 - 21y2 + 3348 - 3395.3 = 0
or 21y5 - 21y2 = 47.3    (ii)

84y5 - 84y2 = 189.2        (ii x 4)
-168y5 + 84y2 = -6958.5    (i x 3)

-84y5 = -6769.3            (Added)
y5 = 6769.3 / 84 = 80.6

Substituting the value of y5 in (ii):

1692.6 - 21y2 = 47.3
-21y2 = 47.3 - 1692.6
-21y2 = -1645.3
y2 = 1645.3 / 21 = 78.3


The acreage in 1953 = 78.3 millions
The acreage in 1956 = 80.6 millions
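
When two figures are missing, the two binomial equations can be set up and solved mechanically. The sketch below is only an illustration of the procedure used above; the helper name `equation` and the variable names are assumptions. It solves the pair of equations exactly, so the result for 1953 is obtained without first rounding the 1956 figure.

```python
from math import comb


def equation(values, order):
    """Coefficients of the unknowns and the constant term of (y - 1)^order = 0."""
    unknown = {}
    constant = 0.0
    for k in range(order + 1):
        coeff = (-1) ** k * comb(order, k)
        idx = order - k  # the coefficient belongs to y_idx
        if values[idx] is None:
            unknown[idx] = coeff
        else:
            constant += coeff * values[idx]
    return unknown, constant


acres = [76.6, 78.7, None, 77.7, 78.7, None, 80.6, 77.6, 78.7]
u1, c1 = equation(acres, 8)  # uses y0 ... y8
u2, c2 = equation(acres, 7)  # uses y0 ... y7 (last item left out)

# Solve a1*y2 + b1*y5 = -c1 and a2*y2 + b2*y5 = -c2 by Cramer's rule
a1, b1 = u1[2], u1[5]
a2, b2 = u2[2], u2[5]
det = a1 * b2 - a2 * b1
y2 = (-c1 * b2 + c2 * b1) / det
y5 = (a1 * (-c2) - a2 * (-c1)) / det
print(round(y2, 1), round(y5, 1))
```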

2. Parabolic Curve Method. This method of interpolation is
applicable in all cases. The equation of this curve is

$$y = a + bx + cx^2 + dx^3 + ex^4 + \dots + nx^n$$

The order of the parabola depends upon the number of
known items in the series. It is one less than the number of
known items. In this equation a, b, c, d, e etc. are constants.
The parabola so determined passes through all the known
points. This method is also known as the 'method of simultaneous
equations'.


Illustration—6 


The following figures give the population of Agra at
decennial censuses. Interpolate for 1946.

Year        Population in Lakhs
1931        172
1941        171.7
1951        157.2
1961        183.9

As the known items are four, a parabola of order n - 1, or
4 - 1 = 3rd order, will be fitted:

$y = a + bx + cx^2 + dx^3$

Here x stands for the deviation of each year from the year of
interpolation, taken in units of five years.

Year        Deviation (years)        x        y
1931        -15                      -3       172
1941        -5                       -1       171.7
1946        0                        0        y0
1951        +5                       +1       157.2
1961        +15                      +3       183.9

By substituting the values in the equation $y = a + bx + cx^2 + dx^3$ we get

When x = -3:  $172 = a - 3b + 9c - 27d$      (1)
     x = -1:  $171.7 = a - b + c - d$        (2)
     x = 0:   $y_0 = a$                      (3)
     x = +1:  $157.2 = a + b + c + d$        (4)
     x = +3:  $183.9 = a + 3b + 9c + 27d$    (5)

If the value of ‘a’ is found with the help of these equations, it
will be the desired value of $y_0$.

Adding equations (2) and (4):
$171.7 = a - b + c - d$
$157.2 = a + b + c + d$
$328.9 = 2a + 2c$      (6)

Adding equations (1) and (5):
$172 = a - 3b + 9c - 27d$
$183.9 = a + 3b + 9c + 27d$
$355.9 = 2a + 18c$     (7)

$2960.1 = 18a + 18c$   (Equation 6 × 9)   (8)
$355.9 = 2a + 18c$     (deduct 7)
$2604.2 = 16a$

$162.763 = a$

Hence the required figure for 1946 = 162.763 lakhs.
If more than one figure is to be interpolated, this becomes
a lengthy process. In that case a trend equation is found first and
the values are then interpolated from it. For this, deviations are
calculated from any origin in the x and y variables and, with the
help of the parabolic curve equation, the values of the constants are
calculated. The trend equation is then

$y - B = a + b(X - A) + c(X - A)^2 + d(X - A)^3 + \dots$

Here B = origin in y variables.
A = origin in x variables.
X = the value of x for which the value of y is to be
interpolated.
Illustration —7 


From the following life table, calculate the number living at
ages 25, 35 and 47.

Age (in years)        Number living
20                    51
30                    44
40                    35
50                    24
(M. A. Alld.)

x        20        30        40        50
y        51        44        35        24

A = 40 (origin of x variable)
B = 35 (origin of y variable)

Deviations from the origins:

x        -20        -10        0        +10
y        16         9          0        -11

As there are four known items, a parabola of 4 - 1 = 3rd
order will be fitted:

$y = a + bx + cx^2 + dx^3$

Since the curve passes through (0, 0), a = 0.
Substituting for x and y we get

$16 = -20b + 400c - 8000d$      (1)
$9 = -10b + 100c - 1000d$       (2)
$-11 = 10b + 100c + 1000d$      (3)

Adding (2) and (3):
$-2 = 200c$
or $c = -.01$

By substituting the value of c in equation No. 1:
$16 = -20b - 4 - 8000d$
or $20 = -20b - 8000d$
or $20b + 8000d = -20$          (4)

By substituting the value of c in equation No. 2:
$9 = -10b - 1 - 1000d$
or $10 = -10b - 1000d$
or $10b + 1000d = -10$          (5)

Dividing the 4th by 20 and the 5th by 10 we get
$b + 400d = -1$                 (6)
$b + 100d = -1$                 (7)

$4b + 400d = -4$                (7th × 4)
$b + 400d = -1$                 (Deduct 6th)
$3b = -3$
$b = -1$

Substituting the value of b in the 6th:
$-1 + 400d = -1$
$400d = 0$
$d = 0$

Values of constants are: a = 0, b = -1, c = -.01, d = 0

Trend equation is
$y - B = a + b(x - A) + c(x - A)^2 + d(x - A)^3$
Substituting values we get
$y - 35 = 0 - 1(x - 40) - .01(x - 40)^2 + 0(x - 40)^3$

Interpolation for 25:
$y - 35 = 0 - 1(25 - 40) - .01(25 - 40)^2 + 0(25 - 40)^3$
or $y - 35 = 15 - 2.25$
or $y = 35 + 15 - 2.25 = 47.75$

Interpolation for 35:
$y - 35 = 0 - 1(35 - 40) - .01(35 - 40)^2 + 0(35 - 40)^3$
or $y - 35 = 5 - .25$
or $y = 35 + 5 - .25 = 39.75$

Interpolation for 47:
$y - 35 = 0 - 1(47 - 40) - .01(47 - 40)^2 + 0(47 - 40)^3$
or $y - 35 = -7 - .49$
or $y = 35 - 7 - .49 = 27.51$

Number living at age 25 = 47.75
                     35 = 39.75
                     47 = 27.51
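
The method of simultaneous equations used in this illustration can be sketched in Python as below. It is a minimal sketch only: the function names `fit_parabola` and `evaluate` are hypothetical, and the equations are solved by Gauss-Jordan elimination with exact fractions rather than by the hand substitutions shown in the text.

```python
from fractions import Fraction

def fit_parabola(xs, ys):
    """Constants a, b, c, ... of y = a + b*x + c*x^2 + ... through every point."""
    n = len(xs)
    rows = [[Fraction(x) ** p for p in range(n)] + [Fraction(y)]
            for x, y in zip(xs, ys)]
    for col in range(n):
        # Bring a row with a non-zero entry in this column into place.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

def evaluate(coeffs, x):
    """Value of the fitted parabola at deviation x."""
    return sum(c * Fraction(x) ** p for p, c in enumerate(coeffs))

A, B = 40, 35                      # origins of the x and y variables
ages = [20, 30, 40, 50]
living = [51, 44, 35, 24]
coeffs = fit_parabola([x - A for x in ages], [y - B for y in living])
estimates = {age: float(B + evaluate(coeffs, age - A)) for age in (25, 35, 47)}
print(coeffs, estimates)
```

Working with deviations from the origins A and B, as the text does, keeps the numbers small; the fitted curve is the same whichever origin is chosen.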
3—Newton's Method—This method is applicable when the
variation in x is by equal increments. Newton's formula gives the
best results when interpolation is to be done near the beginning
of the series. The formula is

$y_x = y_0 + x\Delta^1_0 + \frac{x(x-1)}{1 \times 2}\Delta^2_0 + \frac{x(x-1)(x-2)}{1 \times 2 \times 3}\Delta^3_0 + \frac{x(x-1)(x-2)(x-3)}{1 \times 2 \times 3 \times 4}\Delta^4_0 + \dots$

Where $y_x$ is the figure to be interpolated,
$y_0$ is the value at the year of origin,
$\Delta$ are the leading differences between the adjoining values
of y, and

$x = \frac{\text{year of interpolation} - \text{year of origin}}{\text{time distance between adjoining years}}$

Differences are calculated as shown below:


TABLE SHOWING FINITE OR ADVANCING DIFFERENCES

x        y        First Differences              Second Differences                       Third Differences                        Fourth Differences
                  $\Delta^1$                     $\Delta^2$                               $\Delta^3$                               $\Delta^4$
x0       y0
x1       y1       $y_1 - y_0 = \Delta^1_0$
x2       y2       $y_2 - y_1 = \Delta^1_1$       $\Delta^1_1 - \Delta^1_0 = \Delta^2_0$
x3       y3       $y_3 - y_2 = \Delta^1_2$       $\Delta^1_2 - \Delta^1_1 = \Delta^2_1$   $\Delta^2_1 - \Delta^2_0 = \Delta^3_0$
x4       y4       $y_4 - y_3 = \Delta^1_3$       $\Delta^1_3 - \Delta^1_2 = \Delta^2_2$   $\Delta^2_2 - \Delta^2_1 = \Delta^3_1$   $\Delta^3_1 - \Delta^3_0 = \Delta^4_0$


The differences should be calculated very carefully. If one
difference is calculated wrongly, all the differences derived from
it will also be wrong.

Illustration—8


The following table gives the population of a State in India.
Find out the population for 1936:

Year        Population in Lakhs
1911        120
1921        128
1931        139
1941        153
1951        168
(B. Com. B.H.U.)

Finite or Advancing Differences

Year (x)        Population in lakhs (y)        $\Delta^1$               $\Delta^2$              $\Delta^3$               $\Delta^4$
1911  x0        120  y0
1921  x1        128  y1                        8   $\Delta^1_0$
1931  x2        139  y2                        11  $\Delta^1_1$         3  $\Delta^2_0$
1941  x3        153  y3                        14  $\Delta^1_2$         3  $\Delta^2_1$         0   $\Delta^3_0$
1951  x4        168  y4                        15  $\Delta^1_3$         1  $\Delta^2_2$         -2  $\Delta^3_1$         -2  $\Delta^4_0$

$x = \frac{\text{year of interpolation} - \text{year of origin}}{\text{difference between the adjoining values of } x} = \frac{1936 - 1911}{10} = \frac{25}{10} = 2.5$

The formula is

$y_x = y_0 + x\Delta^1_0 + \frac{x(x-1)}{1 \times 2}\Delta^2_0 + \frac{x(x-1)(x-2)}{1 \times 2 \times 3}\Delta^3_0 + \frac{x(x-1)(x-2)(x-3)}{1 \times 2 \times 3 \times 4}\Delta^4_0$

By substituting values we get

$y_x = 120 + (2.5 \times 8) + \frac{2.5(2.5-1)}{2} \times 3 + \frac{2.5(2.5-1)(2.5-2)}{6} \times 0 + \frac{2.5(2.5-1)(2.5-2)(2.5-3)}{24} \times (-2)$

$y_x = 120 + 20.0 + 5.625 + 0 + 0.078125$
$= 145.703125$
or 145.7 Lakhs.
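
The calculation above may be sketched in Python as follows. The function names `forward_differences` and `newton_forward` are assumptions made for this sketch; the formula itself is Newton's advancing-difference formula as given in the text.

```python
def forward_differences(ys):
    """Leading differences [y0, first, second, ...] of an equally spaced series."""
    leading, column = [ys[0]], list(ys)
    while len(column) > 1:
        column = [b - a for a, b in zip(column, column[1:])]
        leading.append(column[0])
    return leading

def newton_forward(x0, step, ys, target):
    """Newton's formula with x = (target - x0) / step."""
    x = (target - x0) / step
    total, term = 0.0, 1.0
    for k, diff in enumerate(forward_differences(ys)):
        total += term * diff
        term *= (x - k) / (k + 1)    # builds x(x-1)...(x-k) / (k+1)!
    return total

population_1936 = newton_forward(1911, 10, [120, 128, 139, 153, 168], 1936)
print(population_1936)
```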


Interpolation in Frequency Distributions 




In a frequency distribution, interpolation is done after 
cumulating the frequencies. 


Illustration—9 
Estimate the number of persons whose incomes are between
Rs. 400 and Rs. 500 from the following figures:

Income in rupees        No. of persons in thousands
Below 200               120
200-400                 145
400-600                 200
600-800                 250
800-1000                150
(M.A. Agra)

Finite or Advancing Differences

Income in rupees (x)        No. of persons in thousands, cumulated (y)        $\Delta^1$                $\Delta^2$                 $\Delta^3$                 $\Delta^4$
Below 200    x0             120  y0
Below 400    x1             265  y1                                           145  $\Delta^1_0$
Below 600    x2             465  y2                                           200  $\Delta^1_1$         55    $\Delta^2_0$
Below 800    x3             715  y3                                           250  $\Delta^1_2$         50    $\Delta^2_1$         -5    $\Delta^3_0$
Below 1,000  x4             865  y4                                           150  $\Delta^1_3$         -100  $\Delta^2_2$         -150  $\Delta^3_1$         -145  $\Delta^4_0$

$x = \frac{x \text{ to be interpolated} - x \text{ at origin}}{\text{difference between the adjoining values of } x} = \frac{500 - 200}{200} = 1.5$

$y_x = y_0 + x\Delta^1_0 + \frac{x(x-1)}{1 \times 2}\Delta^2_0 + \frac{x(x-1)(x-2)}{1 \times 2 \times 3}\Delta^3_0 + \frac{x(x-1)(x-2)(x-3)}{1 \times 2 \times 3 \times 4}\Delta^4_0$

By substituting values we get

$y_x = 120 + (1.5 \times 145) + \frac{1.5(1.5-1)}{2} \times 55 + \frac{1.5(1.5-1)(1.5-2)}{6} \times (-5) + \frac{1.5(1.5-1)(1.5-2)(1.5-3)}{24} \times (-145)$

$= 120 + 217.5 + 20.625 + 0.3125 - 3.3984375$
$= 355.0390625$
= 355 thousands approx.

Persons having income below Rs. 500 = 355 thousands
Persons having income below Rs. 400 = 265 thousands
Persons having income between Rs. 400 and Rs. 500 = 90 thousands
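
The two steps of this illustration, cumulating the frequencies and then applying Newton's formula, can be sketched in Python. The function name `newton_forward` and the variable names are hypothetical and serve only this sketch.

```python
from itertools import accumulate

def newton_forward(ys, x):
    """Newton's advancing-difference formula; x is counted in class
    intervals from the first (origin) value of ys."""
    total, term, column, k = 0.0, 1.0, list(ys), 0
    while column:
        total += term * column[0]            # leading difference of order k
        term *= (x - k) / (k + 1)
        column = [b - a for a, b in zip(column, column[1:])]
        k += 1
    return total

frequencies = [120, 145, 200, 250, 150]      # thousands; classes end at 200, 400, ..., 1000
cumulative = list(accumulate(frequencies))   # persons below 200, 400, ..., 1000
below_500 = newton_forward(cumulative, (500 - 200) / 200)
between_400_and_500 = below_500 - cumulative[1]
print(below_500, between_400_and_500)
```

The class frequency is obtained only at the end, as the difference of two cumulative figures.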


3a—Newton's Formula for halving a group—Sometimes,
particularly in problems of population, where data are classified
in ten-year age groups, it is required to find the number of
people in the two five-year age groups of a composite ten-year age
group. Newton developed a simple second-difference formula
for this purpose. The formula requires the number of people
in the ten-year age groups preceding and succeeding the group
which has to be broken into two five-year age groups ($f_{1a}$ and
$f_{1b}$). It is

$f_{1a} = \frac{1}{2}\left\{f_1 + \frac{1}{8}(f_0 - f_2)\right\}$

The other part $f_{1b}$ will be $f_{1b} = f_1 - f_{1a}$

Where
$f_0$ is the number in the preceding ten-year age group,
$f_1$ is the number in the ten-year age group to be broken into
two equal groups $f_{1a}$ and $f_{1b}$,
$f_2$ is the number in the succeeding ten-year age group.


Ilustration—10 


Given the population of a town in three ten-year age groups,
it is required to estimate the number of people in the 20-25 and
25-30 age groups separately.

Age Group        Population
10-20            14,500        f0
20-30            15,420        f1
30-40            13,500        f2

$f_{1a}$ (20-25) $= \frac{1}{2}\left\{15420 + \frac{1}{8}(14500 - 13500)\right\}$
$= \frac{1}{2}(15420 + 125)$
$= \frac{1}{2}(15545)$
$= 7773$ approx.

$f_{1b}$ (25-30) $= f_1 - f_{1a}$
$= 15420 - 7773$
$= 7647$
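
A short Python sketch of the halving formula is given below; the function name `halve_group` is an assumption of this sketch. It returns the unrounded halves, which the text rounds to whole persons.

```python
def halve_group(f0, f1, f2):
    """Split the middle ten-year group f1 into two five-year groups,
    using the preceding group f0 and the succeeding group f2."""
    first_half = 0.5 * (f1 + (f0 - f2) / 8)
    return first_half, f1 - first_half

younger, older = halve_group(14500, 15420, 13500)
print(younger, older)
```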


4—Newton-Gauss (Forward) Formula—The Newton-Gauss (forward)
formula is to be applied when the figure to be interpolated
is in the middle of the series and when the variation in the x
variable is by equal increments.


The formula is

$y_x = y_0 + x\Delta^1 y_0 + \frac{x(x-1)}{1 \times 2}\Delta^2 y_{-1} + \frac{(x+1)x(x-1)}{1 \times 2 \times 3}\Delta^3 y_{-1} + \frac{(x+1)x(x-1)(x-2)}{1 \times 2 \times 3 \times 4}\Delta^4 y_{-2} + \dots$


Illustration —11 


Estimate the number living at the age of 13 from the
following portion of a life table:

Age in years        Number of living
10                  100,000
12                  99,223
14                  98,540
16                  97,843

Differences

Age (x)                No. of living (y)             $\Delta^1$                       $\Delta^2$                      $\Delta^3$
10  $x_{-1}$           1,00,000  $y_{-1}$
12  $x_0$              99,223    $y_0$               -777  $\Delta^1 y_{-1}$
14  $x_{+1}$           98,540    $y_{+1}$            -683  $\Delta^1 y_0$             +94  $\Delta^2 y_{-1}$
16  $x_{+2}$           97,843    $y_{+2}$            -697  $\Delta^1 y_{+1}$          -14  $\Delta^2 y_0$             -108  $\Delta^3 y_{-1}$

(The origin is taken just above the year for which we have to
interpolate.)

$x = \frac{13 - 12}{2} = \frac{1}{2} = .5$

$y_x = y_0 + x\Delta^1 y_0 + \frac{x(x-1)}{1 \times 2}\Delta^2 y_{-1} + \frac{(x+1)x(x-1)}{1 \times 2 \times 3}\Delta^3 y_{-1}$

$= 99223 + (.5 \times -683) + \frac{.5(.5-1)}{2} \times 94 + \frac{(.5+1).5(.5-1)}{6} \times (-108)$

$= 99223 - 341.5 - 11.75 + 6.75$
= 98,876.5 or 98,877

The number living at the age of 13 is 98,877.
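
The Newton-Gauss (forward) calculation can be sketched in Python as below, carried only as far as the third differences used in this illustration. The names `difference_table` and `gauss_forward`, and the convention that `origin` is the position of $y_0$ in the list, are assumptions of the sketch.

```python
def difference_table(ys):
    """table[k][i] is the k-th difference starting from the i-th value."""
    table = [list(ys)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([b - a for a, b in zip(prev, prev[1:])])
    return table

def gauss_forward(ys, origin, x):
    """Newton-Gauss forward formula up to third differences.

    origin is the list position of y0; x = (target - x0) / step.
    """
    d = difference_table(ys)
    return (ys[origin]
            + x * d[1][origin]                                # first difference of y0
            + x * (x - 1) / 2 * d[2][origin - 1]              # second difference of y-1
            + (x + 1) * x * (x - 1) / 6 * d[3][origin - 1])   # third difference of y-1

living = [100000, 99223, 98540, 97843]       # ages 10, 12, 14, 16
at_13 = gauss_forward(living, origin=1, x=(13 - 12) / 2)
print(at_13)
```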
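5—Stirling's Formula—Like the Newton-Gauss formula,
Stirling's formula is to be employed when the figure to be
interpolated is in the middle of the series and when the variation
in x is by equal increments. The formula is

$y_x = y_0 + x\,\frac{\Delta^1 y_{-1} + \Delta^1 y_0}{2} + \frac{x^2}{1 \times 2}\Delta^2 y_{-1} + \frac{x(x^2 - 1^2)}{1 \times 2 \times 3} \cdot \frac{\Delta^3 y_{-2} + \Delta^3 y_{-1}}{2} + \frac{x^2(x^2 - 1^2)}{1 \times 2 \times 3 \times 4}\Delta^4 y_{-2} + \dots$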

Illustration —12 


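Using Stirling's formula interpolate the value of y
when x = 15.

x        10        12        14        16        18        20
y        50        60        75        95        120       150

x                  y          $\Delta^1$                    $\Delta^2$                   $\Delta^3$
10  $x_{-2}$       50
12  $x_{-1}$       60         10  $\Delta^1 y_{-2}$
14  $x_0$          75         15  $\Delta^1 y_{-1}$         5  $\Delta^2 y_{-2}$
16  $x_{+1}$       95         20  $\Delta^1 y_0$            5  $\Delta^2 y_{-1}$         0  $\Delta^3 y_{-2}$
18  $x_{+2}$       120        25  $\Delta^1 y_{+1}$         5  $\Delta^2 y_0$            0  $\Delta^3 y_{-1}$
20  $x_{+3}$       150        30  $\Delta^1 y_{+2}$         5  $\Delta^2 y_{+1}$         0  $\Delta^3 y_0$

$x = \frac{15 - 14}{2} = .5$

$y_x = y_0 + x\,\frac{\Delta^1 y_{-1} + \Delta^1 y_0}{2} + \frac{x^2}{2}\Delta^2 y_{-1} + \frac{x(x^2 - 1^2)}{1 \times 2 \times 3} \cdot \frac{\Delta^3 y_{-2} + \Delta^3 y_{-1}}{2}$

By substituting the values we get

$y_x = 75 + .5 \times \frac{15 + 20}{2} + \frac{.25}{2} \times 5 + \frac{.5(.25 - 1)}{6} \times \frac{0 + 0}{2}$
$= 75 + 8.75 + .625 + 0$
$= 84.375$

The value of y when x = 15 is 84.375.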
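
Stirling's formula, taken as far as the third differences, may be sketched in Python as follows. The function names and the `origin` convention (list position of $y_0$) are assumptions made for this sketch.

```python
def difference_table(ys):
    """table[k][i] is the k-th difference starting from the i-th value."""
    table = [list(ys)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([b - a for a, b in zip(prev, prev[1:])])
    return table

def stirling(ys, origin, x):
    """Stirling's central-difference formula up to third differences."""
    d = difference_table(ys)
    mean_first = (d[1][origin - 1] + d[1][origin]) / 2       # mean of the two first differences about y0
    mean_third = (d[3][origin - 2] + d[3][origin - 1]) / 2   # mean of the two third differences about y0
    return (ys[origin]
            + x * mean_first
            + x ** 2 / 2 * d[2][origin - 1]
            + x * (x ** 2 - 1) / 6 * mean_third)

ys = [50, 60, 75, 95, 120, 150]              # x = 10, 12, ..., 20
value_at_15 = stirling(ys, origin=2, x=(15 - 14) / 2)
print(value_at_15)
```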

6—Newton-Gauss (Backward) Formula—This formula is to
be used when the figure to be interpolated falls at the end of
the series and when the variation in the x variable is by equal
increments. In this case $x_0$ is taken as the value succeeding the
figure to be interpolated. The formula is

$y_x = y_0 - x\Delta^1 y_{-1} + \frac{(x+1)x}{1 \times 2}\Delta^2 y_{-1} - \frac{(x+1)x(x-1)}{1 \times 2 \times 3}\Delta^3 y_{-2} + \frac{(x+1)x(x-1)(x-2)}{1 \times 2 \times 3 \times 4}\Delta^4 y_{-2}$
Illustration—13 
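Estimate the value of y when x = 23 from the following data.

x        5        10        15        20        25        30
y        25       32        40        47        55        64

Differences

x                  y                  $\Delta^1$                   $\Delta^2$                    $\Delta^3$                    $\Delta^4$
5   $x_{-4}$       25  $y_{-4}$
10  $x_{-3}$       32  $y_{-3}$       7  $\Delta^1 y_{-4}$
15  $x_{-2}$       40  $y_{-2}$       8  $\Delta^1 y_{-3}$         1   $\Delta^2 y_{-4}$
20  $x_{-1}$       47  $y_{-1}$       7  $\Delta^1 y_{-2}$         -1  $\Delta^2 y_{-3}$         -2  $\Delta^3 y_{-4}$
25  $x_0$          55  $y_0$          8  $\Delta^1 y_{-1}$         1   $\Delta^2 y_{-2}$         2   $\Delta^3 y_{-3}$         4   $\Delta^4 y_{-4}$
30  $x_{+1}$       64  $y_{+1}$       9  $\Delta^1 y_0$            1   $\Delta^2 y_{-1}$         0   $\Delta^3 y_{-2}$         -2  $\Delta^4 y_{-3}$

$y_x = y_0 - x\Delta^1 y_{-1} + \frac{(x+1)x}{1 \times 2}\Delta^2 y_{-1} - \frac{(x+1)x(x-1)}{1 \times 2 \times 3}\Delta^3 y_{-2}$

$y_0 = 55$, $\Delta^1 y_{-1} = 8$, $\Delta^2 y_{-1} = 1$, $\Delta^3 y_{-2} = 0$ and $x = \frac{25 - 23}{5} = .4$

Substituting these values in the formula, we get

$y_x = 55 - (.4 \times 8) + \frac{(.4 + 1)(.4)}{2} \times 1 - \frac{(.4 + 1)(.4)(.4 - 1)}{6} \times 0$

$= 55 - 3.2 + .28 - 0$
$= 52.08$

Estimated value of y, when x = 23, is 52.08.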
7—Lagrange's Method—Lagrange, a famous French
mathematician whose name is made of two mathematical functions
'LAG' and 'RANGE', has given a formula for interpolation.
This formula is applicable when x advances by unequal
intervals. The formula is

$y_x = y_0\,\frac{(x-x_1)(x-x_2)(x-x_3)(x-x_4)\dots(x-x_n)}{(x_0-x_1)(x_0-x_2)(x_0-x_3)(x_0-x_4)\dots(x_0-x_n)}$
$+\; y_1\,\frac{(x-x_0)(x-x_2)(x-x_3)(x-x_4)\dots(x-x_n)}{(x_1-x_0)(x_1-x_2)(x_1-x_3)(x_1-x_4)\dots(x_1-x_n)}$
$+\; y_2\,\frac{(x-x_0)(x-x_1)(x-x_3)(x-x_4)\dots(x-x_n)}{(x_2-x_0)(x_2-x_1)(x_2-x_3)(x_2-x_4)\dots(x_2-x_n)}$
$+\; y_3\,\frac{(x-x_0)(x-x_1)(x-x_2)(x-x_4)\dots(x-x_n)}{(x_3-x_0)(x_3-x_1)(x_3-x_2)(x_3-x_4)\dots(x_3-x_n)}$
$+\; y_4\,\frac{(x-x_0)(x-x_1)(x-x_2)(x-x_3)\dots(x-x_n)}{(x_4-x_0)(x_4-x_1)(x_4-x_2)(x_4-x_3)\dots(x_4-x_n)} + \dots$

Where x stands for the figure for which interpolation is to
be done.

$x_0, x_1, x_2, x_3, x_4$, etc. and $y_0, y_1, y_2, y_3, y_4$, etc. are the values
of the x and y series respectively.


Illustration—14 


The following table gives the number of income tax
assessees in the U. P.

Income not exceeding        No. of assessees
Rs. 2500                    7166
Rs. 3000                    10576
Rs. 5000                    17200
Rs. 7500                    20505
Rs. 10000                   21975

Estimate the number of assessees with income not exceeding
Rs. 4000.

(M.A. Alld.)

x                      y
2500   $x_0$           7166   $y_0$
3000   $x_1$           10576  $y_1$
5000   $x_2$           17200  $y_2$
7500   $x_3$           20505  $y_3$
10000  $x_4$           21975  $y_4$

x = 4000

The formula is

$y_x = y_0\,\frac{(x-x_1)(x-x_2)(x-x_3)(x-x_4)}{(x_0-x_1)(x_0-x_2)(x_0-x_3)(x_0-x_4)} + y_1\,\frac{(x-x_0)(x-x_2)(x-x_3)(x-x_4)}{(x_1-x_0)(x_1-x_2)(x_1-x_3)(x_1-x_4)}$
$+\; y_2\,\frac{(x-x_0)(x-x_1)(x-x_3)(x-x_4)}{(x_2-x_0)(x_2-x_1)(x_2-x_3)(x_2-x_4)} + y_3\,\frac{(x-x_0)(x-x_1)(x-x_2)(x-x_4)}{(x_3-x_0)(x_3-x_1)(x_3-x_2)(x_3-x_4)}$
$+\; y_4\,\frac{(x-x_0)(x-x_1)(x-x_2)(x-x_3)}{(x_4-x_0)(x_4-x_1)(x_4-x_2)(x_4-x_3)}$

By substituting the values we get

$y_x = 7166\,\frac{(4000-3000)(4000-5000)(4000-7500)(4000-10000)}{(2500-3000)(2500-5000)(2500-7500)(2500-10000)}$
$+\; 10576\,\frac{(4000-2500)(4000-5000)(4000-7500)(4000-10000)}{(3000-2500)(3000-5000)(3000-7500)(3000-10000)}$
$+\; 17200\,\frac{(4000-2500)(4000-3000)(4000-7500)(4000-10000)}{(5000-2500)(5000-3000)(5000-7500)(5000-10000)}$
$+\; 20505\,\frac{(4000-2500)(4000-3000)(4000-5000)(4000-10000)}{(7500-2500)(7500-3000)(7500-5000)(7500-10000)}$
$+\; 21975\,\frac{(4000-2500)(4000-3000)(4000-5000)(4000-7500)}{(10000-2500)(10000-3000)(10000-5000)(10000-7500)}$

$= 7166\,\frac{1000 \times -1000 \times -3500 \times -6000}{-500 \times -2500 \times -5000 \times -7500} + 10576\,\frac{1500 \times -1000 \times -3500 \times -6000}{500 \times -2000 \times -4500 \times -7000}$
$+\; 17200\,\frac{1500 \times 1000 \times -3500 \times -6000}{2500 \times 2000 \times -2500 \times -5000} + 20505\,\frac{1500 \times 1000 \times -1000 \times -6000}{5000 \times 4500 \times 2500 \times -2500}$
$+\; 21975\,\frac{1500 \times 1000 \times -1000 \times -3500}{7500 \times 7000 \times 5000 \times 2500}$

$= 7166 \times \frac{-56}{125} + 10576 \times 1 + 17200 \times \frac{63}{125} + 20505 \times \frac{-8}{125} + 21975 \times \frac{1}{125}$
$= -3210.37 + 10576 + 8668.8 - 1312.32 + 175.8$
$= -4522.69 + 19420.6$
$= 14897.91$
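
Lagrange's formula lends itself to a short Python sketch, given below. The function name `lagrange` is an assumption of this sketch; the arithmetic is the same term-by-term sum as in the working above.

```python
def lagrange(xs, ys, x):
    """Lagrange's formula: each y is multiplied by the product of
    (x - xj) / (xi - xj) over all the other values xj."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total

incomes = [2500, 3000, 5000, 7500, 10000]
assessees = [7166, 10576, 17200, 20505, 21975]
at_4000 = lagrange(incomes, assessees, 4000)
print(round(at_4000, 2))
```

Because the intervals between the incomes are unequal, no difference table is needed; only the given pairs of values enter the calculation.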


Extrapolation—Extrapolation is the process of making
definite estimates of future conditions by working on a
systematic and reliable basis. It depends upon practical analysis
of the past and present conditions indicating the nature of
probable future conditions. It is an estimate based on the
assumption that there will be no sudden changes and that
conditions in the future will not change considerably.


Theoretical Questions


1—Discuss the utility of interpolation and extrapolation to
a businessman. What are the different methods known to you
for interpolation ? (M. Com. Alld.)


2—Write a note on the necessity and usefulness of inter- 
polation. What are the assumptions of the various algebraic 
methods of interpolation. 


Practical Questions 


1—From the following table, estimate by using Newton’s 
formula the premium payable at the age of 22 years. 


Premium Table to secure Rs. 100 


Age (years)        Premium
20                 25
25                 28
30                 32
35                 37
40                 43.5
45                 52.25


(М. А. Eco. St. Delhi) 
(Ans. 26 approx.) 


2—Extrapolate the population of a town for 1966 from the
following data about its population during the previous four
censuses:


Census year        Population in thousands
1931               478
1941               468
1951               454
1961               484
(M. Com. Raj.)


(Ans. 532 approx., Lagrange's method)
3—The annual sales of a concern are given below:


Years Sales of cloth 
in lakhs of yards 

1935 125 

1940 163 

1945 204 

1950 238 

1955 282 


Assuming the conditions of the market to be the same, 
estimate the sales for the year 1960. 


(Ans. 380, Binomial Method) 


4—From the following data, find the number of students 
who obtained less than 45 marks :— 


Marks No. of students 
30—40 31 
40—50 42 
50—60 51 
60—70 85 
70—80 81 


(M. Com. Alld.) 
(Ans. 48 by Newton's formula) 
5— Following are the marks obtained by 492 candidates in 
a certain examination :— 
Not more than 40 marks 212 candidates 
E > > 45 M 296 » 


» ” » 75 » 492 » 
Find out the number of candidates who obtained more than
42 but not more than 45 marks. (M. Com. Alld.)
(Ans. 256 approx. Newton's method) 


6—Interpolate the missing figure in the following table with 
the help of a suitable formula :— 


1951        1331
1952        1728
1953        2197
1954        ?
1955        3375
1956        4096
1957        4913


(M. A. Delhi)
(Ans. 1954 = 2744 by Binomial method)
7—Discuss briefly the nature and suitability of the chief
methods of interpolation. Estimate the annual sales of cloth for
1955 from the following data:


Year        Sale of cloth
            in lakhs of yards
1940        250
1945        285
1950        328
1960        444
(M. A. Agra) 


(Ans. 380.5 Lagrange's method) 


8—Explain the methods used in forecasting the growth of
population. The population of a certain town is given below:


Year        Population
1921        22,000
1931        27,000
1941        34,000
1947        39,000
1951        42,000
(Ans. 44560, Lagrange's method) (M. Com. Agra)


9—From the following data estimate the number of persons - 
earning between 60 and 70 rupees :— 


Wages in rupees No. of persons 
in thousands 
Below 40 250 
-40— 60 120 
60— 80 100 
80—100 70 
100—120 50 


(M. Com. Agra) | 
(Ans. 53.6 thousands by Newton's formula) + 


10—Estimate by Newton’s method of interpolation the ^ 
expectation of life at age 29 from the following data stating the 
assumptions underlying the formula used by you :— 


Age in years                        10      15      20      25      30      35
Expectation of life in years        35.4    32.2    29.1    26.0    23.1    20.4
(Ans. 27.85 years) (M. Com. Agra)


11—The following are the sales of a departmental store at 
Connaught Place Delhi. Interpolate sales for the year 1950. 
Year                     1948    1949    1951    1952
Sales in thousand        200     240     350     400
(Ans. 293.3 thousand by Parabolic Curve method)
12—The following table gives the population of Indore city
at the time of the last six censuses:


1901        99,880
1911        57,285
1921        1,07,948
1931        1,47,100
1941        2,03,695
1951        3,10,859


Estimate the population for 1961. 
(M. Com. Vikram) 
(Ans. 5,381,859 after interpolating for 1911 as well, as 
there is sudden fall in that year) 


13—In the following table h is the height above sea-level
and p the barometric pressure. Calculate p where h = 5280.
h = 0, 4763, 6942, 10,593
p = 27, 25, 23, 20
(M. Com. Vikram) 
(Ans. 24.8 Lagrange's method) 


14—Using Newton's formula for interpolation estimate the 
population of Agra for the year 1936— 


Year Population 
1921 98,754 
1931 1,382,285 
1941 1,68,076 
1951 1,95,690 
1961 2,46.050 


(Ans. 1,47,008) 
15—Using Stirling's formula interpolate for x = 35

x         y
20        512
30        439
40        346
50        243


(Ans. 395)
16—Estimate the population of ‘City of Taj’ for the year 
1936 from the following census figures. 


Year 1901, 1911, 1921, 1931, 1941, 1951 
Population 12. 15, 20, 27, 39, 52 
(000) 


(Ans. 38.125 thousand by Newton Gauss Backward method) 


17—The observed values of a function are respectively 168, 
120, 72 and 63 at the four positions 3, 7, 9 and 10 of the 
independent variable. What is the best estimate you can give 
of the value of the function at the position 6 of the independent 
variable ? 03 

(Ans. 147 by Lagrange’s method) 
18—Find out by interpolation from the following data the
number of workers earning Rs. 24 or more but less than Rs. 25.


Earning less than        No. of workers
20                       296
25                       599
30                       804
35                       918
40                       966
(M. Com. Agra)


(Ans. 53 by Newton's method) 
19—Interpolate the probable number of persons earning | 


between 20 and 25 rupees from the following figures :— А 
Income in Rs. No. of person 
.Lessthan 10 150 

10—20 170 
20—30 200 
30—40 250 
40—50 


180 
(B. Com. B.H.U.) 
(Ans. 91.41 or 91 by Newton's method) 


20—Find the expectation of life at the age of 24 using the 
following data— 


Expectation 
15 32.2 
20 29.6 
25 26.3 
30 23.1 
(B. Com. Agra) | 


(Ans. 26.9 Newton's method) 


21—The following are the annual premiums charged by an
Insurance company for a policy of Rs. 1000. Calculate the
premium payable at the age of 26.


Age        Premium in Rs.
20         23
25         26
30         30
35         35
40         42


(B. Com. Agra) 

(Ans. Rs. 26.73 by Newton's method) 

22—If $l_x$ represents the number living at age x in a life
table, find as accurately as the data will permit $l_x$ for values of
x = 35, 42 and 47; given

$l_{20} = 512$, $l_{30} = 439$
$l_{40} = 346$, $l_{50} = 243$


(I. A. S.)
(Ans. $l_{35} = 394.38$, $l_{42} = 325.88$, $l_{47} = 274.33$ by Parabolic
curve method)
23—Determine by Lagrange's formula the percentage number
of criminals under 35 years.

                        % number of criminals
Under 25 years          52
Under 30 years          67.3
Under 40 years          84.1
Under 50 years          94.4


(М. А. Agra) 

(Ans. 77.4%, Lagrange's method) 
24— Use some appropriate interpolation method and re- 
construct the following frequency table with the intervals halved :— 


X Frequency 
0—2 85 
2—4 52 
4—6 84 


(М. А. Raj.) 
(Hint :—Convert into cumulative frequency table and apply 
Newton's. method) 
(Ans. Frequency for 0 — 1, 2—3 and 5 — 6 would be 21, 
22 and 38 respectively) 
25—The length of the day was 12 hours on March 19th,
14 hours on April 18th and 15 hours 40 minutes on May 18th.
Required an approximate value of (a) the length of the day on
May 3rd (b) the mean length of the day during the period March
19th to May 18th.
(I.A.S.) 


(Ans (а) 14 Hrs. 53 minutes (b) 13.9 Hrs.) 
26—Estimate the probable number of persons earning 
between Rs. 30 and 40 from the following— 


Income in Rs. No. of persons 
15—20 78 
20—30 97 
30—45 110 
45—55 180 
55—70 140 


(Ans. 53 by Lagrange's method) 

27—The following figures showing the relationship between 
the amount of manure used and the yield per acre of rice were 
supplied by а research institute after several years of 


experimentation :— 
Amount of manure        Yield of rice per acre
per acre (Tons)         (Mds)
0                       10
10                      15
20                      17
30                      18


Obtain, by using a suitable interpolation formula, the yield
of rice per acre corresponding to 5 and 15 tons of manure per acre.


(I.A.S.)
(Ans. 13 and 16.25 mds, Newton's method)
28—The following figures relate to the number of estates
liable to estate duty in a particular year:


Class of estate                  Number liable
Rs. 25,000 to Rs. 30,000         638
Rs. 30,000 to Rs. 40,000         740
Rs. 40,000 to Rs. 50,000         415


Estimate the number between Rs. 31,000 and Rs. 32,000 by
interpolation.
(M. A. Alld.)
(Ans. 88 Newton's method)


29—Given $\log_{10} 654 = 2.8156$; $\log_{10} 658 = 2.8182$;
$\log_{10} 659 = 2.8189$; $\log_{10} 661 = 2.8202$,
find $\log_{10} 656$. (B. A. Vikram)
(Ans. 2.8168)


30— The following table gives the normal weight c: а baby 
during the first six months of life : 
Ageinmonths: 0 2 3 Б] 6 
Weight іп lbs: 5 7 ОЕ 
Estimate the weight of baby at the age of 4 months. 
(М.А. Patna) 
(Ans. 8.9 lbs. approx.) 


31—Estimate the missing term in the following table. 
2: 1 2 8 4 5 6 7 
Jo A Зза 64 чов 
(B.A. Hons. Delhi) 
(Ans. 16) 


32—Estimate the expectation of life at the age of 16 years 
from the following data :— 


Age (in years) 10 | 15 20 25 30 35 
Expectation of life | 
(in years) 35.4 | 32.3 | 29.2 | 26.0 | 32.2 | 20.4 
(B. Sc. Agra) 


(Ans. 82.196) 


CHAPTER 15


ASSOCIATION OF ATTRIBUTES 
AND CONTINGENCY 


Attributes and Variables. The statistical methods deal 
with quantitative data alone. The quantitative data may arise 
in two different ways :— 


(1) In the first place, the observer may measure the actual 
magnitude of some variable character for each of the objects 
or individuals observed. He may record, for instance, the ages 
of persons at the time of marriage, heights of students,
expenditure of the labour class, etc. The observations in these cases
are called Statistics of Variables.


(2) In the second place, the observer may note the 
presence or absence of some attribute in a series of objects or 
individuals. There are certain phenomena like blindness,
insanity, deaf-mutism and the like, which cannot be measured. In
such cases only their absence or presence can be studied. The
quantitative character in such cases arises solely in the counting.
Such data are called Statistics of Attributes.

Classification. Attribute means the quality of the 
observed object. When we collect data with regard to definite 
qualities and place them in one group, it is said to be classi- 
fication according to attribute. Data are classified on the basis 
of presence or absence of particular attributes. In the simplest 
case, if only one attribute is being studied, then only two 
mutually exclusive classes are formed—one of those in whom the 
attribute is present and the other of those in whom the attribute 
is not present. If one attribute, say literacy, is being studied
then there will be two classes: one of those who are literate and
the other of those who are not literate. If more than one
attribute is studied then there will be more than two classes. If
besides literacy, criminality is also to be studied, then there would
be classes of ‘Literates’, ‘Not literates’, ‘Criminals’, ‘Not criminals’,
‘Literate criminals’, ‘Literate not criminals’, ‘Not literate
criminals’ and ‘Not literate not criminals’.


It may be noticed that the fact of classification does not 
necessarily imply the existence of either a natural or a clearly 
defined boundary between classes. If a universe is divided into 
two classes according to an attribute, say, literacy, then there is
no clear cut demarcation between classes of literates and not 
literates. It is rather difficult to lay down a clear cut definition 
of an attribute. The division may be wholly arbitrary. There 
is likelihood of division being vague and uncertain in such cases. 
Yule and Kendall state that, “The possibility of uncertainties of 
this kind should always be borne in mind in considering 
statistics of attributes : Whatever the nature of the classification, 
however natural or artificial, definite or uncertain, the final 
judgement must be decisive ; any one object or individual must
be held either to possess the given attribute or not."

If only one attribute is being considered, the population is - 
divided into two classes—one in which that attribute is present 
and the other in which that attribute is not present. These 
classes are mutually exclusive. Such classification is called
‘classification or division by dichotomy’. The classifications in
most statistics are not dichotomous, for usually a class
is divided into more than two sub-classes. Such classification is
called ‘manifold classification’.


Notation and Terminology. For theoretical purposes, it
is necessary to have some simple notation for the classes formed.
Usually capital letters A, B, C, etc. are used to denote the presence
of attributes and Greek letters α (alpha), β (beta), γ (gamma),
etc. are used to denote the absence of attributes. In place of
Greek letters we can also use small letters a, b, c, etc. Thus if ‘A’
represents the attribute of literacy, ‘a’ would represent absence
of literacy; if ‘B’ represents criminality, then ‘b’ would
represent absence of criminality; and if ‘C’ represents punishment,
then ‘c’ would represent non-punishment.


Combination of attributes will be represented by the
combination of letters. Thus if ‘A’ represents literacy and ‘B’
criminality then AB represents the combination of literacy and
criminality, Ab would represent literacy and not criminality, aB
would represent not literacy and criminality and ab would
represent not literacy and not criminality.


The number of observations in different classes are called 
‘class frequencies’. Class frequencies are denoted by enclosing 
the class notation by brackets like (AB), (aB) etc. 


The total number of classes formed on the basis of the
number of attributes studied can be known by the formula $3^n$,
where n stands for the number of attributes studied. If only
one attribute is studied, then there will be only $3^1$ or 3 classes
(N, A, a). If two attributes are studied, then there will be $3^2$
(3 × 3 = 9) or 9 classes in all (N, A, B, a, b, AB, Ab, aB, ab). If
three attributes are studied, then there will be $3^3$ (3 × 3 × 3 = 27)
or 27 classes in all.

The attributes denoted by capital letters are called positive
attributes. The classes having positive attributes (A, AB, ABC,
etc.) are called positive classes. The attributes denoted by small
letters are called negative attributes, and classes having negative
attributes (a, ab, abc) are called negative classes. Classes
having a combination of both positive and negative attributes,
like Ab, aB, aBc, etc., are called pairs of contrary classes.

Order of classes. The total number of observations, the
universe or population, is denoted by N and is called the class
of the zero order. If two attributes, say A and B, are studied, then
A, B, a, b would be classes of the first order and AB, Ab, aB, ab
would be the classes of the second order. If there are no further
classes of higher orders they would also be called the ‘classes of
ultimate order’. If three attributes are studied then ABC, ABc,
etc. would be classes of the third order, and if no further order
exists then these classes would become classes of ultimate order.
The total number of classes of ultimate order is determined by
the formula $2^n$, where ‘n’ stands for the number of attributes
studied. If two attributes are studied then the number of
classes of ultimate order would be $2^2$ (2 × 2) or 4. In case three
attributes are studied then there would be $2^3$ (2 × 2 × 2) or 8 classes
of ultimate order.
This can be illustrated as given below:


Order 0        N
Order 1        (A)        (B)        (C)
               (a)        (b)        (c)
Order 2        (AB)       (AC)       (BC)
               (Ab)       (Ac)       (Bc)
               (aB)       (aC)       (bC)
               (ab)       (ac)       (bc)
Order 3        (ABC)      (aBC)
               (ABc)      (aBc)
               (AbC)      (abC)
               (Abc)      (abc)

[Diagram: N divides into A and a; A divides into AB and Ab, and a into aB and ab; each of these divides again into its C and c classes, giving ABC, ABc, AbC, Abc, aBC, aBc, abC and abc.]

On the basis of this classification:
N = (A) + (a)
  = (B) + (b)
(A) = (AB) + (Ab)
(a) = (aB) + (ab)
(B) = (AB) + (aB)
(b) = (Ab) + (ab)


If there are three attributes then
(AB) = (ABC) + (ABc)
(Ab) = (AbC) + (Abc)
(aB) = (aBC) + (aBc)
(ab) = (abC) + (abc)
Similarly other relationships can also be found out.
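
The counts $3^n$ and $2^n$ can be verified by listing the classes, as in the following Python sketch. The function names `classes` and `order` are assumptions of the sketch: each attribute is treated as present (capital letter), absent (small letter) or not specified, and the class in which no attribute is specified is N.

```python
from itertools import product

def classes(attributes):
    """List every class that can be formed from the given attributes."""
    result = []
    for choice in product((None, True, False), repeat=len(attributes)):
        name = "".join(a.upper() if present else a.lower()
                       for a, present in zip(attributes, choice)
                       if present is not None)
        result.append(name or "N")           # nothing specified: the whole universe
    return result

def order(name):
    """Order of a class: the number of attributes it specifies."""
    return 0 if name == "N" else len(name)

two = classes("AB")
three = classes("ABC")
ultimate = [c for c in three if order(c) == 3]
print(len(two), len(three), ultimate)
```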
Illustration —1 


From the following ultimate class frequencies, find the
frequencies of the positive and negative classes and the total
number of observations:

(AB) = 100        (aB) = 80
(Ab) = 50         (ab) = 40

We have to find out
N, (A), (B), (a) and (b)

N = (A) + (a)
  = (AB) + (Ab) + (aB) + (ab)
  = 100 + 50 + 80 + 40
  = 270
(A) = (AB) + (Ab)
    = 100 + 50 = 150
(a) = (aB) + (ab)
    = 80 + 40 = 120
(B) = (AB) + (aB)
    = 100 + 80 = 180
(b) = (Ab) + (ab)
    = 50 + 40 = 90

Another method of finding the frequencies, with the help
of a nine-square table, is given below:

             A           a           Total
B            (AB)        (aB)        (B)
             100         80          180
b            (Ab)        (ab)        (b)
             50          40          90
Total        (A)         (a)         N
             150         120         270

Illustration —2 
Given the following frequencies of the positive classes, find
the frequencies of the ultimate classes:
(A) = 160, (B) = 200, (AB) = 140, N = 500


We have to find out
(Ab), (aB) and (ab)

(Ab) = (A) - (AB)
     = 160 - 140 = 20

(aB) = (B) - (AB)
     = 200 - 140 = 60

(ab) = (a) - (aB)
     = {N - (A)} - {(B) - (AB)}
     = N - (A) - (B) + (AB)
     = 500 - 160 - 200 + 140
     = 280
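
The relations used in Illustrations 1 and 2 can be put together in a short Python sketch. The function names `margins` and `two_attribute_table` are hypothetical; the first works from the ultimate classes to the totals, the second from the positive classes to the ultimate classes.

```python
def margins(AB, Ab, aB, ab):
    """Positive and negative class totals and N from the ultimate classes."""
    return {"A": AB + Ab, "a": aB + ab,
            "B": AB + aB, "b": Ab + ab,
            "N": AB + Ab + aB + ab}

def two_attribute_table(N, A, B, AB):
    """Ultimate class frequencies from the positive classes and N."""
    return {"AB": AB,
            "Ab": A - AB,
            "aB": B - AB,
            "ab": N - A - B + AB}

illustration_1 = margins(AB=100, Ab=50, aB=80, ab=40)
illustration_2 = two_attribute_table(N=500, A=160, B=200, AB=140)
print(illustration_1, illustration_2)
```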


Illustration—3 


A number of labourers in a factory were examined for the
presence or absence of certain defects of which three chief
descriptions were noted:

A = Physical weakness
B = Nerve signs
C = Mental dullness


Given the following ultimate frequencies, find the frequencies
of the positive classes including the whole number of
observations N:


(ABC) = 75         (aBC) = 98
(ABc) = 310        (aBc) = 702
(AbC) = 106        (abC) = 74
(Abc) = 489        (abc) = 4815

We have to find out
N          (AB)
(A)        (AC)
(B)        (BC)
(C)        (ABC) Given

(A) = (ABC) + (ABc) + (AbC) + (Abc)
    = 75 + 310 + 106 + 489
    = 980
(B) = (ABC) + (ABc) + (aBC) + (aBc)
    = 75 + 310 + 98 + 702
    = 1185
(C) = (ABC) + (AbC) + (aBC) + (abC)
    = 75 + 106 + 98 + 74
    = 353
(AB) = (ABC) + (ABc)
     = 75 + 310
     = 385
(AC) = (ABC) + (AbC)
     = 75 + 106
     = 181
(BC) = (ABC) + (aBC)
     = 75 + 98
     = 173
(ABC) = 75 (already given)
Illustration—4 


Given the following frequencies of the positive classes, find
out the remaining class frequencies.

N   = 12000    (AB)  = 453
(A) =   977    (AC)  = 284
(B) =  1185    (BC)  = 250
(C) =   596    (ABC) = 127

We have to find out:
(a), (b), (c), (Ab), (aB), (ab), (Ac), (aC), (ac), (Bc),
(bC), (bc), (ABc), (AbC), (Abc), (aBC), (aBc), (abC), (abc).

(a)   = N - (A)
      = 12000 - 977 = 11023
(b)   = N - (B)
      = 12000 - 1185 = 10815
(c)   = N - (C)
      = 12000 - 596 = 11404
(ABc) = (AB) - (ABC)
      = 453 - 127 = 326
(AbC) = (AC) - (ABC)
      = 284 - 127 = 157
(aBC) = (BC) - (ABC)
      = 250 - 127 = 123
(Abc) = (Ab) - (AbC)
      = (A) - (AB) - (AbC)
      = 977 - 453 - 157 = 367
(aBc) = (aB) - (aBC)
      = (B) - (AB) - (aBC)
      = 1185 - 453 - 123 = 609
(abC) = (bC) - (AbC)
      = (C) - (BC) - (AbC)
      = 596 - 250 - 157 = 189
(Ab)  = (AbC) + (Abc)
      = 157 + 367 = 524
(aB)  = (aBC) + (aBc)
      = 123 + 609 = 732
(Ac)  = (ABc) + (Abc)
      = 326 + 367 = 693
(aC)  = (aBC) + (abC)
      = 123 + 189 = 312
(Bc)  = (ABc) + (aBc)
      = 326 + 609 = 935
(bC)  = (AbC) + (abC)
      = 157 + 189 = 346
(abc) = (ab) - (abC)
      = (b) - (Ab) - (abC)
      = 10815 - 524 - 189
      = 10102
(ab)  = (abC) + (abc) = 189 + 10102 = 10291
(ac)  = (aBc) + (abc) = 609 + 10102 = 10711
(bc)  = (Abc) + (abc) = 367 + 10102 = 10469
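
The chain of subtractions used in Illustration 4 can be sketched in Python as below. This is only an illustrative sketch: the function name `ultimate_from_positive` and the dictionary keys are assumptions, and the figures are those of Illustration 4.

```python
def ultimate_from_positive(N, A, B, C, AB, AC, BC, ABC):
    """Derive the eight ultimate class frequencies of three attributes
    from N and the frequencies of the positive classes."""
    ABc = AB - ABC              # (ABc) = (AB) - (ABC)
    AbC = AC - ABC              # (AbC) = (AC) - (ABC)
    aBC = BC - ABC              # (aBC) = (BC) - (ABC)
    Abc = A - AB - AbC          # (Abc) = (A) - (AB) - (AbC)
    aBc = B - AB - aBC          # (aBc) = (B) - (AB) - (aBC)
    abC = C - BC - AbC          # (abC) = (C) - (BC) - (AbC)
    # Whatever is left of N after the other seven classes is (abc).
    abc = N - (ABC + ABc + AbC + aBC + Abc + aBc + abC)
    return {"ABC": ABC, "ABc": ABc, "AbC": AbC, "Abc": Abc,
            "aBC": aBC, "aBc": aBc, "abC": abC, "abc": abc}


classes = ultimate_from_positive(N=12000, A=977, B=1185, C=596,
                                 AB=453, AC=284, BC=250, ABC=127)
print(classes)
```

Any second-order negative class then follows by adding two ultimate classes, for example (Ac) = (ABc) + (Abc).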


Illustration—5


After an air-raid a survey was made of the local hospital 
specially reserved for those injured in the raid. There were 
1200 beds in all. 600 patients were found to have fractured 
their arms, legs and skulls as a result of the bombing. There 
were 50 patients who had no injury over their body but were 
admitted only to provide speedy recovery from the shock of 
the raid. Patients with fractured arm had a majority of 192 
over those with no arm injury. The number of patients who 
had escaped head injury was 270. There were 36 patients 
with a fractured arm but had fortunately no injury in their 
legs. Similarly there were 204 patients with a fractured leg, 
but without any head injury. The majority of those with a 
fractured leg over those without injury to their legs was 620. 

Make an analysis of the injury of patients according to the 
injuries received by them. 

(M. Com., Vikram) 

Let A stand for injury to arm and 'a' for no injury to arm,
B stand for injury to leg and 'b' for no injury to leg,
C stand for injury to skull and 'c' for no injury to skull.

Then the above data can be stated as:

N           = 1200    (c)         = 270
(ABC)       =  600    (Ab)        =  36
(abc)       =   50    (Bc)        = 204
(A) - (a)   =  192    (B) - (b)   = 620

We have to find out the remaining ultimate class frequencies,
viz. (ABc), (AbC), (Abc), (aBC), (aBc) and (abC). In order to
find them out we have first to calculate the frequencies of the
positive classes, viz. (A), (B), (C), (AB), (AC) and (BC).


(i) First order frequencies
(A) + (a) = N = 1200
(A) - (a) = 192
2(A)      = 1392
or (A)    = 696 (No. of persons with fractured arms)

(B) + (b) = N = 1200
(B) - (b) = 620
2(B)      = 1820
or (B)    = 910 (No. of persons with fractured legs)

(C) = N - (c)
    = 1200 - 270
    = 930 (No. of persons with fractured skull)

(ii) Second order frequencies
(AB) = (A) - (Ab)
     = 696 - 36
     = 660 (No. of persons with fractured arms and legs)
(BC) = (B) - (Bc)
     = 910 - 204
     = 706 (No. of persons with fractured legs and skulls)
Since (abc) = N - (A) - (B) - (C) + (AB) + (BC) + (AC) - (ABC),
(AC) = (abc) - N + (A) + (B) + (C) - (AB) - (BC) + (ABC)
     = 50 - 1200 + 696 + 910 + 930 - 660 - 706 + 600
     = 620 (No. of persons with fractured arms and skulls)
(iii) Remaining ultimate class frequencies
(ABc) = (AB) - (ABC)
      = 660 - 600 = 60 (No. of persons with fractured arms and legs but not skull)

(AbC) = (AC) - (ABC)
      = 620 - 600 = 20 (No. of persons with fractured arms and skulls but not legs)

(aBC) = (BC) - (ABC)
      = 706 - 600 = 106 (No. of persons with fractured legs and skulls but not arms)

(aBc) = (aB) - (aBC)
      = (B) - (AB) - (aBC)
      = 910 - 660 - 106 = 144 (No. of persons with fractured legs but not arms and skulls)

(Abc) = (Ab) - (AbC)
      = (A) - (AB) - (AbC)
      = 696 - 660 - 20 = 16 (No. of persons with fractured arms but not legs and skulls)

(abC) = (bC) - (AbC)
      = (C) - (BC) - (AbC)
      = 930 - 706 - 20 = 204 (No. of persons with fractured skull but not arms and legs)
Illustration—6

There were 400 students in the B. Com. class of a University.
Their results in the various terminal examinations are given
below:

I Terminal: passed 180
II Terminal: passed 140
III Terminal: passed 180

60 passed in all the terminals;
80 failed in all the three;
40 passed in the Ist and IInd terminals and failed in the IIIrd;
70 failed in the Ist and IInd terminals and passed in the IIIrd.

Find out how many students passed at least two examinations.
(M. Com. Vikram)

Let success in the Ist terminal be denoted by A and failure by 'a',
success in the IInd terminal be denoted by B and failure by 'b',
success in the IIIrd terminal be denoted by C and failure by 'c'.
Then the given data are:
N     = 400
(A)   = 180
(B)   = 140
(C)   = 180
(ABC) =  60
(abc) =  80
(ABc) =  40
(abC) =  70
We have to find out the value of:
(ABC) + (ABc) + (AbC) + (aBC)
Now
(aBC) + (AbC) + (ABC) + (abC) = (C)
(aBC) + (AbC) = (C) - (ABC) - (abC)
              = 180 - 60 - 70
              = 50
Hence
(ABC) + (ABc) + (AbC) + (aBC)
= 60 + 40 + 50
= 150
Thus the number of students who passed at least two
examinations is 150.


Illustration—7

Measurements are made on a thousand husbands and a
thousand wives. If the measurements of the husbands exceed
the measurements of the wives in 789 cases for one measurement,
in 741 cases for another and in 690 cases for both
measurements, in how many cases will both measurements on the
wife exceed the measurements on the husband?

Let A stand for the cases in which the measurements of
husbands exceed those of wives in one measurement and B for
those in the other measurement.

Then N = 1000, (A) = 789, (B) = 741, (AB) = 690
We have to find out (ab).

(ab) = (a) - (aB)
or (ab) = {N - (A)} - {(B) - (AB)}
or (ab) = N - (A) - (B) + (AB)
        = 1000 - 789 - 741 + 690
        = 160

Thus the number of cases in which both measurements on
the wife exceed the measurements on the husband is 160.


Illustration—8 
In a war between White and Red forces there are more Red
soldiers than White; there are more armed Whites than unarmed
Reds; there are fewer armed Reds with ammunition
than unarmed Whites without ammunition. Show that there are
more armed Reds without ammunition than unarmed Whites
with ammunition. (Yule and Kendall)
Let 'A' stand for a White soldier and 'a' for a Red soldier,
'B' stand for armed and 'b' for unarmed,
and 'C' stand for possession of ammunition
and 'c' for non-possession of ammunition.
Then the data can be denoted as

(a) > (A)        ... (i)
(AB) > (ab)      ... (ii)
(Abc) > (aBC)    ... (iii)

We have to prove that (aBc) > (AbC).
From (i), considering the dichotomy of each side according
to B, we have

(a) > (A)
or (aB) + (ab) > (AB) + (Ab)
As (AB) is greater than (ab), as given in (ii), the inequality
holds all the more if (ab) is substituted for (AB) on the right:

(aB) + (ab) > (ab) + (Ab)
or (aB) > (Ab)     [(ab) is common]

From this, considering the dichotomy of each side according
to C, we have

(aBC) + (aBc) > (AbC) + (Abc)

As (Abc) is greater than (aBC), as given in (iii), the inequality
holds all the more if (aBC) is substituted for (Abc) on the right:

(aBC) + (aBc) > (AbC) + (aBC)

(aBC) is common,
hence (aBc) > (AbC)
Q.E.D.


Consistence of Data. The statistics of attributes are
obtained by counting and therefore no class frequency can be
negative. If any class frequency has a negative value, the data
are said to be inconsistent. Such inconsistency may be due to
wrong counting, inaccurate additions or subtractions, or
misprints.

In order to test the consistency of data we have to see
whether any class frequency is negative or not. If no class
frequency has a negative value, the data are said to be
consistent, otherwise inconsistent. It should however be
remembered that consistence of data is no proof of accurate
counting or printing, though inconsistence of data is a sure
proof of the data being incorrect. To find out the consistency of data,
the ultimate class frequencies should be found out, because if there
is any inconsistency, one or more ultimate class frequencies will
be negative. To find out inconsistency the following tests are
applied:


If only one attribute is studied

(1) (A) ≥ 0, otherwise (A) will be negative
(2) (A) ≤ N, otherwise (a) will be negative
    because N = (A) + (a)

If two attributes are studied

(3) (AB) ≥ 0, otherwise (AB) will be negative
(4) (AB) ≤ (A), otherwise (Ab) will be negative
    because (A) = (AB) + (Ab)
(5) (AB) ≤ (B), otherwise (aB) will be negative
    because (B) = (AB) + (aB)
(6) (AB) ≥ (A) + (B) - N, otherwise (ab) will be negative
    because (ab) = (a) - (aB)
    or (ab) = N - (A) - {(B) - (AB)}
    or (ab) = N - (A) - (B) + (AB)
    or -(AB) = N - (A) - (B) - (ab)
    or (AB) = -N + (A) + (B) + (ab)
    or (AB) = (A) + (B) - N + (ab)
    If (AB) is less than (A) + (B) - N it is obvious that (ab)
    will be negative.

If three attributes are studied

(7) (ABC) ≥ 0, otherwise (ABC) will be negative
(8) (ABC) ≥ (AB) + (AC) - (A), otherwise (Abc) will be negative
    because (Abc) = (Ab) - (AbC)
    or (Abc) = (A) - (AB) - {(AC) - (ABC)}
    or (Abc) = (A) - (AB) - (AC) + (ABC)
    or -(ABC) = (A) - (AB) - (AC) - (Abc)
    or (ABC) = (AB) + (AC) - (A) + (Abc)
    Now if (ABC) is less than (AB) + (AC) - (A) it is obvious
    that (Abc) will be negative.
(9) (ABC) ≥ (AB) + (BC) - (B), otherwise (aBc) will be negative
(10) (ABC) ≥ (BC) + (AC) - (C), otherwise (abC) will be negative
(11) (ABC) ≤ (AB), otherwise (ABc) will be negative
    because (AB) = (ABC) + (ABc)
(12) (ABC) ≤ (AC), otherwise (AbC) will be negative
    because (AC) = (ABC) + (AbC)
(13) (ABC) ≤ (BC), otherwise (aBC) will be negative
    because (BC) = (ABC) + (aBC)
(14) (ABC) ≤ (AB) + (BC) + (AC) - (A) - (B) - (C) + N,
    otherwise (abc) will be negative
    because (abc) = (ab) - (abC)
    or (abc) = (a) - (aB) - {(bC) - (AbC)}
    or (abc) = (a) - (aB) - (bC) + (AbC)
    or (abc) = {N - (A)} - {(B) - (AB)} - {(C) - (BC)} + (AC) - (ABC)
    or (abc) = N - (A) - (B) + (AB) - (C) + (BC) + (AC) - (ABC)
    or (ABC) = (AB) + (AC) + (BC) - (A) - (B) - (C) + N - (abc)
    Now if (ABC) is more than (AB) + (AC) + (BC) - (A) - (B) - (C) + N,
    it is obvious that (abc) will be negative.

From the above rules, it is clear that when three attributes
are studied then
(ABC) ≥ 0
(ABC) ≤ (AB) + (AC) + (BC) - (A) - (B) - (C) + N
The upper limit of (ABC) cannot be less than the lower
limit. Therefore,
(AB) + (AC) + (BC) - (A) - (B) - (C) + N ≥ 0
or (AB) + (AC) + (BC) ≥ (A) + (B) + (C) - N
(15) (AB) + (AC) + (BC) ≥ (A) + (B) + (C) - N
Similarly, if we combine the other sets of lower and upper
limits, then
(ABC) ≥ (AB) + (AC) - (A)
(ABC) ≤ (BC)
Hence
(AB) + (AC) - (A) ≤ (BC)
or (AB) + (AC) - (BC) ≤ (A)
(16) (AB) + (AC) - (BC) ≤ (A)
(17) (AB) - (AC) + (BC) ≤ (B)
(18) -(AB) + (AC) + (BC) ≤ (C)
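
When only the first and second order frequencies are known, tests (15) to (18) are the ones that can be applied. The following Python sketch applies them; the function name `second_order_checks` is an assumption made for illustration, and the figures used are those of Illustration 9.

```python
def second_order_checks(N, A, B, C, AB, AC, BC):
    """Apply consistency tests (15) to (18) and report which ones hold."""
    return {
        "(15)": AB + AC + BC >= A + B + C - N,
        "(16)": AB + AC - BC <= A,
        "(17)": AB - AC + BC <= B,
        "(18)": -AB + AC + BC <= C,
    }


# Frequencies as printed in the report, with (BC) = 85
print(second_order_checks(N=1000, A=510, B=490, C=427, AB=189, AC=140, BC=85))
# The same frequencies with (BC) taken as 185
print(second_order_checks(N=1000, A=510, B=490, C=427, AB=189, AC=140, BC=185))
```

A single failed test is enough to show that the data are inconsistent; passing all four does not prove that the counting was accurate.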


Illustration—9 


If a report gives the following frequencies as actually
observed, show that there must be a misprint or mistake of some
sort, and that possibly the misprint consists in the dropping of a 1
before the 85 given as the frequency (BC).

N   = 1000
(A) = 510    (AB) = 189
(B) = 490    (AC) = 140
(C) = 427    (BC) =  85

(AB) + (AC) + (BC) ≥ (A) + (B) + (C) - N
or (BC) ≥ (A) + (B) + (C) - (AB) - (AC) - N
or (BC) ≥ 510 + 490 + 427 - 189 - 140 - 1000
or (BC) ≥ 98

But (BC) = 85 < 98, therefore it is not the correct value of
(BC). The value cannot be less than 98. If the figure is made
185 then all the conditions are fulfilled.


Illustration—10


A market investigator returns the following data. Of 1000
people consulted, 811 liked chocolates, 752 liked toffee and 418
liked boiled sweets; 570 liked chocolates and toffee, 356 liked
chocolates and boiled sweets and 348 liked toffee and boiled
sweets; 297 liked all three. Show that this information as it
stands must be incorrect.

(M. Sc. Agra)
Let A denote liking of chocolates,
B denote liking of toffee,
C denote liking of boiled sweets; then the data as
returned would be
N    = 1000,   (B)   = 752,   (C)  = 418
(A)  =  811,   (AC)  = 356,   (BC) = 348
(AB) =  570,   (ABC) = 297

Now
(ABC) ≤ (AB) + (AC) + (BC) - (A) - (B) - (C) + N
or (ABC) ≤ 570 + 356 + 348 - 811 - 752 - 418 + 1000
or (ABC) ≤ 293
But (ABC) = 297, which is greater than 293; hence the data are inconsistent.


Illustration—11 


If in a village actually invaded by anthrax, 70 per cent of
the goats are attacked and 85 per cent have been inoculated with
vaccine, what is the lowest percentage of the inoculated that
must have been attacked?

Let A stand for attack of anthrax and B stand for inoculation.
Then
N = 100, (A) = 70, (B) = 85

We have to find out the lowest value of (AB).

(AB) ≥ 0

(AB) ≥ (A) + (B) - N = 70 + 85 - 100 = 55

Hence the value of (AB) cannot be less than 55.

Hence the lowest percentage of the inoculated that must
have been attacked is:

(AB)/(B) × 100 = 55/85 × 100 = 64.7, i.e. about 65 per cent.


Illustration—12 


If (A) = 50, (B) = 60, (C) = 80, (AB) = 35, (AC) = 45 and
(BC) = 42, find the greatest and least possible values of (ABC).
(M. Sc. Agra)

(ABC) ≥ 0

(ABC) ≥ (AB) + (AC) - (A) = 35 + 45 - 50 = 30

(ABC) ≥ (AB) + (BC) - (B) = 35 + 42 - 60 = 17

(ABC) ≥ (BC) + (AC) - (C) = 42 + 45 - 80 = 7

Therefore the least value of (ABC) = 30

(ABC) ≤ (AB) = 35

(ABC) ≤ (AC) = 45

(ABC) ≤ (BC) = 42

Therefore the greatest value of (ABC) = 35

Hence 30 ≤ (ABC) ≤ 35
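
The same limits can be obtained by taking the largest of the lower limits and the smallest of the upper limits. The sketch below is illustrative only; the function name `abc_limits` is an assumption, and test (14) is applied only when N is supplied, since N is not given in Illustration 12.

```python
def abc_limits(A, B, C, AB, AC, BC, N=None):
    """Return (least, greatest) possible values of (ABC)
    from tests (7) to (13), and test (14) when N is known."""
    least = max(0, AB + AC - A, AB + BC - B, BC + AC - C)
    upper_limits = [AB, AC, BC]
    if N is not None:
        upper_limits.append(AB + AC + BC - A - B - C + N)
    greatest = min(upper_limits)
    return least, greatest


# Figures of Illustration 12
print(abc_limits(A=50, B=60, C=80, AB=35, AC=45, BC=42))
```

If the least value comes out greater than the greatest value, the given frequencies are themselves inconsistent.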
Illustration—12A

If in a collection of houses actually invaded by smallpox,
70 per cent of the inhabitants are attacked and 85 per cent have
been vaccinated, what is the lowest percentage of the vaccinated
that must have been attacked? (I.A.S.)

Writing A to denote inhabitants attacked by smallpox and B to
denote those vaccinated, the data are:

(A) = 70, (B) = 85, N = 100

First we find the lowest value of (AB).

From the conditions of consistence, we have
(AB) ≥ 0
and (AB) ≥ (A) + (B) - N
         = 70 + 85 - 100
         = 55
The lowest value of (AB) = 55
Hence the lowest percentage of the vaccinated inhabitants who
have been attacked

= (AB)/(B) × 100

= 55/85 × 100 = 64.7 per cent.


Illustration —13 


Show that for n attributes A, B, C, ..., M,
(ABC...M) ≥ (A) + (B) + (C) + ... + (M) - (n - 1)N,
where N is the total frequency.
(M. Sc. Agra)
(AB) ≥ (A) + (B) - N (already proved)
(AC) ≥ (A) + (C) - N
(BC) ≥ (B) + (C) - N
and
(ABC) ≥ (AC) + (BC) - (C)
or    ≥ (A) + (C) - N + (B) + (C) - N - (C)
or    ≥ (A) + (B) + (C) - 2N

Applying this to the universe D we get:
(ABCD) ≥ (AD) + (BD) + (CD) - 2(D)
       ≥ (A) + (D) - N + (B) + (D) - N + (C) + (D) - N - 2(D)
       ≥ (A) + (B) + (C) + (D) - 3N

Hence we see that
(ABC...M) ≥ (A) + (B) + (C) + ... + (M) - (n - 1)N.
Illustration—14 


In a very hotly fought battle 70 per cent at least of the
combatants lost an eye, 75 per cent at least lost an ear, 80 per
cent at least lost an arm and 85 per cent at least lost a leg. How
many at least must have lost all four?

(M. Com. Alld.)

Let N = 100.
Let A, B, C and D denote respectively losing an eye, an
ear, an arm and a leg. Then
(A) = 70, (B) = 75, (C) = 80, (D) = 85
We have to find out (ABCD).

(ABCD) ≥ (A) + (B) + (C) + (D) - (n - 1)N
       ≥ (A) + (B) + (C) + (D) - (4 - 1)N
       ≥ 70 + 75 + 80 + 85 - 3 × 100
       ≥ 310 - 300
       ≥ 10

Hence at least 10% lost all the four.
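
The general result of Illustration 13 lends itself to a short computation. The following Python sketch is illustrative only; the function name `least_joint_frequency` is an assumption. The result is floored at zero because no class frequency can be negative.

```python
def least_joint_frequency(class_frequencies, N):
    """Lower limit of (ABC...M) for n attributes:
    (A) + (B) + ... + (M) - (n - 1) N, but never below zero."""
    n = len(class_frequencies)
    return max(0, sum(class_frequencies) - (n - 1) * N)


# Figures of Illustration 14: eye, ear, arm and leg, with N = 100
print(least_joint_frequency([70, 75, 80, 85], N=100))
```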
Illustration—15 
Given that
(A) = (a) = (B) = (b) = ½N
Prove that (AB) = (ab) and (Ab) = (aB).

(A) = (AB) + (Ab)
(B) = (AB) + (aB)
(A) = (B), therefore
(AB) + (Ab) = (AB) + (aB)
(AB) is common, hence
(Ab) = (aB)     Proved.

(A) = (AB) + (Ab)
(a) = (aB) + (ab)
(A) = (a), therefore
(AB) + (Ab) = (aB) + (ab)
or, since (Ab) = (aB),
(AB) + (aB) = (aB) + (ab)
(aB) is common, hence
(AB) = (ab)     Proved.

Illustration—16

Given that
(A) = (a) = (B) = (b) = (C) = (c) = ½N
and (ABC) = (abc)
Prove that
2(ABC) = (AB) + (BC) + (AC) - ½N.
(M. Sc. Agra)

(abc) = (AB) + (AC) + (BC) - (A) - (B) - (C) + N - (ABC)
or, since (abc) = (ABC),
(ABC) = (AB) + (AC) + (BC) - (A) - (B) - (C) + N - (ABC)
or 2(ABC) = (AB) + (AC) + (BC) - (A) - (B) - (C) + N
or 2(ABC) = (AB) + (AC) + (BC) - ½N - ½N - ½N + N
or 2(ABC) = (AB) + (AC) + (BC) - ½N.     Proved.


Association of Attributes. Statistical data are generally of
two types: those that can be measured quantitatively, e.g. height,
weight etc., and those which are of a descriptive character
and cannot be measured in figures, e.g. deafness, dumbness etc.
When the data can be numerically expressed, the method of
correlation is employed to find out the relationship existing
between the two variables. If it is desired to investigate the
relationship between data of a descriptive character, known
as attributes, the method of association is resorted to. In other
words, the study of the relationship between the attributes of any
two or more variables is known as Association in Statistics.
The difference between correlation and association is, therefore,
that the term correlation is applied to the study of the relationship
between two or more variables where it can be quantitatively
measured, while the term association refers to the study of the
relationship between such variables as cannot be measured
in terms of figures. Often we are interested in studying the
association between two attributes. The word 'ASSOCIATION'
has a technical meaning in statistics. Two attributes are said
to be associated if they appear together in a larger
number of cases than is to be expected if they are independent.
If there is no association of any kind between two attributes
A and B, we expect to find the same proportion of A's amongst
the B's as amongst the not-B's.


In order to find out the association between two attributes, it
becomes necessary to find out the expected number of their
simultaneous occurrence. According to the theory of Probability
the expectation of a particular event is

(Number of favourable cases / Total number of cases) × number of observations.

Thus if a coin is tossed 100 times, the expected number of times
it will fall head upward is
½ × 100 = 50
On the basis of this theory
the probability of (A) is (A)/N
and that of (B) is (B)/N.
The combined probability of (A) and (B) is
(A)/N × (B)/N
and the expectation of (A) and (B) jointly is
(A)/N × (B)/N × N = (A) × (B) / N

The expectation of (ab) = (a) × (b) / N

Similarly the expected number can be found out for the others
also. The association may be positive or negative. Two
attributes A and B are positively associated if the number
of actual observations of (AB) is more than the expected
number. If the number of actual observations of (AB)
is less than the expected number then there is negative
association between them. On the other hand, if the number
of actual observations of (AB) is just equal to the
expected number, then both attributes are independent. Hence:


Attributes   Independence           Positive Association    Negative Association
                                                            or Disassociation
A and B      (AB) = (A) × (B) / N   (AB) > (A) × (B) / N    (AB) < (A) × (B) / N
A and b      (Ab) = (A) × (b) / N   (Ab) > (A) × (b) / N    (Ab) < (A) × (b) / N
a and B      (aB) = (a) × (B) / N   (aB) > (a) × (B) / N    (aB) < (a) × (B) / N
a and b      (ab) = (a) × (b) / N   (ab) > (a) × (b) / N    (ab) < (a) × (b) / N

Illustration—17

Show whether A and B are independent, positively
associated or negatively associated in the following cases:

(i)   N = 750, (A) = 120, (B) = 250 and (AB) = 40
(ii)  N = 600, (A) = 120, (B) = 500 and (AB) = 260
(iii) N = 800, (A) = 300, (B) = 400 and (AB) = 100

(i) Expected (AB) = (A) × (B) / N = 120 × 250 / 750 = 40

This is equal to the actual observation of (AB), hence A
and B are independent.

(ii) Expected (AB) = (A) × (B) / N = 120 × 500 / 600 = 100

The actual observation of (AB) is 260, which is greater than
100, hence in this case A and B are positively associated.

(iii) Expected (AB) = (A) × (B) / N = 300 × 400 / 800 = 150

The actual observation of (AB) is 100, which is less than 150,
hence in this case A and B are negatively associated.
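
The comparison of the observed (AB) with its expected value can be sketched in Python as follows. This is only an illustration; the function name `association` is an assumption, and exact fractions are used so that the test for equality is not disturbed by rounding.

```python
from fractions import Fraction


def association(N, A, B, AB):
    """Compare the observed (AB) with its expectation (A) x (B) / N."""
    expected = Fraction(A * B, N)
    if AB > expected:
        return "positive"
    if AB < expected:
        return "negative"
    return "independent"


# The three cases of Illustration 17
print(association(N=750, A=120, B=250, AB=40))
print(association(N=600, A=120, B=500, AB=260))
print(association(N=800, A=300, B=400, AB=100))
```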


If all the ultimate class frequencies are given, association
can be found out by comparing the following products:
(AB) × (ab) and (Ab) × (aB)
If they are equal there is no association.
If (AB) × (ab) > (Ab) × (aB), there is positive association.
If (AB) × (ab) < (Ab) × (aB), there is negative association.
It can be proved that, when A and B are independent,
(AB) × (ab) = (Ab) × (aB)
because
(AB) × (ab) = {(A) × (B) / N} × {(a) × (b) / N}
            = {(A) × (b) / N} × {(a) × (B) / N}
            = (Ab) × (aB)
Illustration—18

In an anti-malaria campaign in a certain area, quinine was
administered to 812 persons out of a total population of 3,248.
The number of fever cases is shown below:

Treatment      Fever    No Fever
Quinine          20       792
No-Quinine      220      2216

Discuss the usefulness of quinine in checking malaria.
(U. P.C.S.)

Denoting A for Quinine treatment,
a for No Quinine treatment,
B for No attack of fever
and b for attack of fever,
then
(AB) = 792, (Ab) = 20, (aB) = 2216, (ab) = 220
To test association we compare
(AB) × (ab) with (Ab) × (aB)
(792 × 220) with (20 × 2216)
174240 > 44320
Hence there is positive association between A and B. It
means quinine is helpful in checking malaria.

We can look at this problem from another point of view also.
The percentage of persons attacked by fever when quinine was
given is

(Ab)/(A) × 100 = 20/812 × 100 = 2.5 (approx.)

and the percentage of persons attacked by fever when quinine was
not given is

(ab)/(a) × 100 = 220/2436 × 100 = 9 (approx.)

A comparison of these two percentages shows that quinine
is useful in checking malaria.

Coefficient of Association. By the above method we can
only have a rough idea about the association between two
attributes. The degree or extent of association cannot be found
out. To know the extent of association Prof. Yule has given a
formula for calculating the coefficient of association. The formula
is:

Q = {(AB) (ab) - (Ab) (aB)} / {(AB) (ab) + (Ab) (aB)}

The coefficient of association (Q) varies between ±1, and it
is interpreted like the coefficient of correlation. If the result is 0,
there is no association between the two attributes; when the result
is +1, there is perfect positive association and in case of -1
there is perfect negative association.

Illustration—19

Investigate the association between darkness of eye colour
in father and son from the following data:

Fathers with dark eyes and sons with dark eyes              100
Fathers with dark eyes and sons with not dark eyes          158
Fathers with not dark eyes and sons with dark eyes          178
Fathers with not dark eyes and sons with not dark eyes     1564
(M. Com, B.H.U.)
Let A denote fathers with dark eyes,
a denote fathers with not dark eyes,
B denote sons with dark eyes,
b denote sons with not dark eyes.
Then

(AB) = 100, (Ab) = 158, (aB) = 178, (ab) = 1564

Q = {(AB) (ab) - (Ab) (aB)} / {(AB) (ab) + (Ab) (aB)}
By substituting their respective values we get:
Q = {(100 × 1564) - (158 × 178)} / {(100 × 1564) + (158 × 178)}
  = (156400 - 28124) / (156400 + 28124)
  = 128276 / 184524 = +0.695

There is positive association between the eye colours of fathers
and sons.
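
Yule's formula for Q is easily computed. A minimal Python sketch follows; the function name `yule_q` is an assumption made for illustration, and the figures are those of Illustration 19.

```python
def yule_q(AB, Ab, aB, ab):
    """Yule's coefficient of association for a 2 x 2 classification."""
    like = AB * ab        # product of the frequencies of like classes
    unlike = Ab * aB      # product of the frequencies of unlike classes
    return (like - unlike) / (like + unlike)


# Eye colour of fathers and sons (Illustration 19)
q = yule_q(AB=100, Ab=158, aB=178, ab=1564)
print(round(q, 3))
```

Q is undefined when both products are zero, so the sketch assumes that at least one of them is positive.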


Coefficient of Colligation. Prof. Yule has given another
important formula known as the coefficient of colligation. The
coefficient of colligation is represented by Y and its formula is

Y = {1 - sqrt[(Ab) × (aB) / ((AB) × (ab))]} / {1 + sqrt[(Ab) × (aB) / ((AB) × (ab))]}

In the above illustration Y will be

Y = {1 - sqrt[(158 × 178) / (100 × 1564)]} / {1 + sqrt[(158 × 178) / (100 × 1564)]}
  = {1 - sqrt(28124 / 156400)} / {1 + sqrt(28124 / 156400)}
  = (1 - .424) / (1 + .424)
  = .4 (approx.)

From the coefficient of colligation Q can be computed by the
following formula:

Q = 2Y / (1 + Y²)

In the above illustration
Q = (2 × .4) / (1 + .4²) = .8 / 1.16 = .69

Partial Association. So far we have discussed association
between two attributes in a universe, without any regard for
any other attribute or attributes that might exist in the same
universe. Such association is called total association. The
association between two attributes in a sub-universe is called
partial association.

Partial association can be found out by the following
method:
A and B are positively associated in the sub-universe C if
(ABC) > (AC) × (BC) / (C)

There will be negative association if
(ABC) < (AC) × (BC) / (C)
A and B are positively associated in the sub-universe c if
(ABc) > (Ac) × (Bc) / (c)

There will be negative association if
(ABc) < (Ac) × (Bc) / (c)
The coefficient of partial association between A and B within
the universe C will be

Q(AB.C) = {(ABC) (abC) - (AbC) (aBC)} / {(ABC) (abC) + (AbC) (aBC)}

and within the universe c

Q(AB.c) = {(ABc) (abc) - (Abc) (aBc)} / {(ABc) (abc) + (Abc) (aBc)}

Illusory Association. Two attributes A and B may be independent
within both the sub-universes C and c and yet show an association
in the universe at large. This peculiar result indicates that,
although a set of attributes independent of A and B will not
affect the association between them, the existence of an attribute


494 AN INTRODUCTION TO MODERN STATISTICS 


C with which they are both associated may give an association
in the universe at large which is illusory in the sense that it
does not correspond to any real relationship between them. If
the associations between A and C, B and C are of the same sign,
the resulting association between A and B will be positive; if of
opposite signs, negative. Misleading associations may easily
arise through the mingling of records which a careful worker
would keep distinct. If the association within sub-populations is
not studied separately, misleading conclusions are likely to be
drawn.

Contingency. So far we have dealt with dichotomous
classification. A universe may also be divided into a number of
parts by a similar process. Thus attribute 'A' may be subdivided
into A1, A2, A3 etc. Similarly another attribute B may
also be subdivided into B1, B2, B3 etc. Such a classification is
arranged in the form of a contingency table.

Contingency Table showing the Temperament of
Brothers and Sisters

                                     Sisters (B)
Brothers (A)         Quick     Good natured    Sullen     Total
                     (B1)      (B2)            (B3)
Quick (A1)           (A1B1)    (A1B2)          (A1B3)     (A1)
Good natured (A2)    (A2B1)    (A2B2)          (A2B3)     (A2)
Sullen (A3)          (A3B1)    (A3B2)          (A3B3)     (A3)
Total                (B1)      (B2)            (B3)       N


For finding out association in such tables the easiest way
is to convert them into 2×2 tables by merging the various groups
as shown below.

                              Parent
Off Spring    Very Tall    Tall    Medium    Short    Total
Very Tall        20         30       20         2       72
Tall             14        125       85        12      236
Medium            3        140      165       125      433
Short             3         37       68       151      259
Total            40        332      338       290     1000

The above table can be converted into a 2×2 table by merging the
Very Tall and Tall classes, and the Medium and Short classes. The
table will then become:

                       Parent
Off Spring    Tall (B)      Short (b)     Total
Tall (A)      189 (AB)      119 (Ab)        308
Short (a)     183 (aB)      509 (ab)        692
Total         372           628           1,000

Now the coefficient of association can be calculated as:

Q = {(AB) (ab) - (Ab) (aB)} / {(AB) (ab) + (Ab) (aB)}
By substituting the values:
Q = {(189 × 509) - (119 × 183)} / {(189 × 509) + (119 × 183)}
  = (96201 - 21777) / (96201 + 21777)
  = 74424 / 117978
  = +0.63
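
The merging of classes can be sketched in Python as below. This is only an illustration: the function name `collapse_to_2x2` and the variable `STATURE` are assumptions, with rows taken as offspring and columns as parents, in the order Very Tall, Tall, Medium, Short.

```python
STATURE = [
    [20, 30, 20, 2],       # Very Tall offspring
    [14, 125, 85, 12],     # Tall offspring
    [3, 140, 165, 125],    # Medium offspring
    [3, 37, 68, 151],      # Short offspring
]


def collapse_to_2x2(table, row_split, col_split):
    """Merge the first row_split rows and first col_split columns into the
    positive classes A and B, and the remaining ones into a and b."""
    rows, cols = len(table), len(table[0])
    AB = sum(table[i][j] for i in range(row_split) for j in range(col_split))
    Ab = sum(table[i][j] for i in range(row_split) for j in range(col_split, cols))
    aB = sum(table[i][j] for i in range(row_split, rows) for j in range(col_split))
    ab = sum(table[i][j] for i in range(row_split, rows) for j in range(col_split, cols))
    return AB, Ab, aB, ab


AB, Ab, aB, ab = collapse_to_2x2(STATURE, row_split=2, col_split=2)
q = (AB * ab - Ab * aB) / (AB * ab + Ab * aB)
print(AB, Ab, aB, ab, round(q, 2))
```

The value of Q obtained depends on where the classes are merged, so a different split of the same table will in general give a different coefficient.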
Coefficient of Contingency. In order to calculate the
degree of association between the A's and B's on the whole we can
calculate Karl Pearson's Coefficient of Mean Square Contingency,
generally known as the Coefficient of Contingency and denoted by 'C'. We
proceed with the assumption of the null hypothesis, i.e. that the two
attributes are independent and there is no association between
them. On this assumption the expected values of the different cell
frequencies like (A1B1), (A1B2) etc. are calculated with
the formula (A1B1) = (A1) × (B1) / N and so on. If the observed
frequency of each cell is equal to the expected frequency of that
cell, both attributes are independent. If these values are not
equal, then there is association between the two attributes. In order
to test the intensity of association, the difference between the
actual and expected frequencies of the various cells is found out.

These differences are squared and then divided by the
respective expected frequencies; the total of such values is known
as χ² (Chi-square, pronounced Ki-square). In symbolic
form:

χ² = Σ {(Difference of actual and expected frequencies)² / Expected frequencies}
or
χ² = Σ {(f - f1)² / f1}
where f  = actual frequency
      f1 = expected frequency.
This is also called 'square contingency'.

Calculate the coefficient of contingency from the following


table:

                              Parent
Off Spring    Very tall    Tall    Medium    Short    Total
Very tall        20         30       20         2       72
Tall             14        125       85        12      236
Medium            3        140      165       125      433
Short             3         37       68       151      259
Total            40        332      338       290    1,000
(I.C.S.)

Denoting the classes of offspring by A1, A2, A3, A4 and those of
parents by B1, B2, B3, B4, the table becomes:

                                   Parent (B)
Off Spring (A)     Very tall     Tall          Medium        Short         Total
                   (B1)          (B2)          (B3)          (B4)
Very tall (A1)     20 (A1B1)     30 (A1B2)     20 (A1B3)      2 (A1B4)      72 (A1)
Tall (A2)          14 (A2B1)    125 (A2B2)     85 (A2B3)     12 (A2B4)     236 (A2)
Medium (A3)         3 (A3B1)    140 (A3B2)    165 (A3B3)    125 (A3B4)     433 (A3)
Short (A4)          3 (A4B1)     37 (A4B2)     68 (A4B3)    151 (A4B4)     259 (A4)
Total              40 (B1)      332 (B2)      338 (B3)      290 (B4)      1000 N


Expected frequencies will be calculated as follows:

(A1B1) = (A1) × (B1) / N =  72 ×  40 / 1000 =   2.9
(A1B2) = (A1) × (B2) / N =  72 × 332 / 1000 =  23.9
(A1B3) = (A1) × (B3) / N =  72 × 338 / 1000 =  24.3
(A1B4) = (A1) × (B4) / N =  72 × 290 / 1000 =  20.9
(A2B1) = (A2) × (B1) / N = 236 ×  40 / 1000 =   9.4
(A2B2) = (A2) × (B2) / N = 236 × 332 / 1000 =  78.4
(A2B3) = (A2) × (B3) / N = 236 × 338 / 1000 =  79.8
(A2B4) = (A2) × (B4) / N = 236 × 290 / 1000 =  68.4
(A3B1) = (A3) × (B1) / N = 433 ×  40 / 1000 =  17.3
(A3B2) = (A3) × (B2) / N = 433 × 332 / 1000 = 143.8
(A3B3) = (A3) × (B3) / N = 433 × 338 / 1000 = 146.4
(A3B4) = (A3) × (B4) / N = 433 × 290 / 1000 = 125.5
(A4B1) = (A4) × (B1) / N = 259 ×  40 / 1000 =  10.4
(A4B2) = (A4) × (B2) / N = 259 × 332 / 1000 =  85.9
(A4B3) = (A4) × (B3) / N = 259 × 338 / 1000 =  87.5
(A4B4) = (A4) × (B4) / N = 259 × 290 / 1000 =  75.2
Cell        f       f1      (f - f1)    (f - f1)²    (f - f1)² / f1
(A1B1)     20      2.9       17.1        292.41        100.83
(A1B2)     30     23.9        6.1         37.21          1.55
(A1B3)     20     24.3       -4.3         18.49          0.76
(A1B4)      2     20.9      -18.9        357.21         17.09
(A2B1)     14      9.4        4.6         21.16          2.25
(A2B2)    125     78.4       46.6       2171.56         27.70
(A2B3)     85     79.8        5.2         27.04          0.34
(A2B4)     12     68.4      -56.4       3180.96         46.50
(A3B1)      3     17.3      -14.3        204.49         11.82
(A3B2)    140    143.8       -3.8         14.44          0.10
(A3B3)    165    146.4       18.6        345.96          2.36
(A3B4)    125    125.5       -0.5          0.25          0.002
(A4B1)      3     10.4       -7.4         54.76          5.26
(A4B2)     37     85.9      -48.9       2391.21         27.83
(A4B3)     68     87.5      -19.5        380.25          4.34
(A4B4)    151     75.2       75.8       5745.64         76.40
                                             Total χ² = 325.132

Square contingency, χ² = 325.132

Coefficient of contingency, C = sqrt{χ² / (N + χ²)}
                              = sqrt{325.13 / (1000 + 325.13)}
                              = .495

Mean square contingency, φ² (phi-square) = χ² / N = 325.13 / 1000 = .32513

Another formula for calculating C is
C = sqrt{φ² / (1 + φ²)}
  = sqrt{.32513 / (1 + .32513)}
  = .495

II Method

χ² can also be calculated by the following formula:

χ² = Σ (f² / f1) - N

and the coefficient of contingency is then obtained from χ² as before.
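
The whole calculation of χ², φ² and C can be sketched in Python as below. This is an illustrative sketch only; the function names and the variable `TABLE` are assumptions. Because the expected frequencies are not rounded to one decimal place here, χ² comes to about 326.5 rather than the 325.13 of the hand calculation, and C to about .496.

```python
import math

TABLE = [
    [20, 30, 20, 2],
    [14, 125, 85, 12],
    [3, 140, 165, 125],
    [3, 37, 68, 151],
]


def chi_square(table):
    """Return (chi-square, N) for a contingency table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, f in enumerate(row):
            f1 = row_totals[i] * col_totals[j] / n   # expected frequency
            chi2 += (f - f1) ** 2 / f1
    return chi2, n


def contingency_coefficient(chi2, n):
    """Karl Pearson's coefficient of contingency, C = sqrt(chi2 / (N + chi2))."""
    return math.sqrt(chi2 / (n + chi2))


chi2, n = chi_square(TABLE)
phi2 = chi2 / n
print(round(chi2, 2), round(phi2, 4), round(contingency_coefficient(chi2, n), 3))
```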


III Method

Another method of calculating the coefficient of contingency is by
means of the following formula:

C = sqrt{(P - 1) / P}

where P is obtained by:

(a) squaring each cell frequency,

(b) dividing it by the total of its row,

(c) adding the quotients for each column,

(d) dividing these sums by the respective column totals,

(e) adding these column quotients; their sum is P.
Thus in the above illustration

P = 1/40  {20²/72 + 14²/236 + 3²/433 + 3²/259}
  + 1/332 {30²/72 + 125²/236 + 140²/433 + 37²/259}
  + 1/338 {20²/72 + 85²/236 + 165²/433 + 68²/259}
  + 1/290 {2²/72 + 12²/236 + 125²/433 + 151²/259}
  = .16 + .39 + .35 + .43 = 1.33

C = sqrt{(1.33 - 1) / 1.33} = sqrt(.33 / 1.33) = .498


The coefficient of contingency discussed above has the drawback
that it never reaches 1. Its maximum value depends upon the number
of subdivisions of the two attributes. The maximum value will be

$$C_{max} = \sqrt{\frac{t - 1}{t}}$$

where t stands for the number of subdivisions.

In a 2×2 table C's maximum value will be $\sqrt{\frac{2 - 1}{2}} = .707$
Similarly in a 3×3 table C's maximum value will be .816
Similarly in a 4×4 table C's maximum value will be .866
Similarly in a 5×5 table C's maximum value will be .894
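The ceiling on C for a square table can be tabulated directly from the formula. A small Python sketch (the helper name is illustrative):

```python
import math


def max_contingency(t):
    """Upper limit of Pearson's C for a t x t table: sqrt((t - 1) / t)."""
    return math.sqrt((t - 1) / t)


for t in (2, 3, 4, 5):
    print(f"{t} x {t} table: maximum C = {max_contingency(t):.3f}")
```

The limit rises towards 1 as t grows but never attains it, which is the drawback referred to in the text.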


Tschuprow's Coefficient

As Karl Pearson's coefficient of contingency does not reach
the maximum limit of 1, Tschuprow has suggested another
coefficient, T. It is

$$T = \sqrt{\frac{C^2}{(1 - C^2)\sqrt{(s - 1)(t - 1)}}}$$

where s = No. of rows
      t = No. of columns

In the above illustration T will be

$$T = \sqrt{\frac{.495^2}{(1 - .495^2)\sqrt{(4 - 1)(4 - 1)}}} = \sqrt{\frac{.245}{(.755) \times 3}} = \sqrt{.108} = .329$$
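As a check on the substitution, Tschuprow's formula can be written as a short Python function. This is a sketch only; the function name is illustrative, and C = .495 is taken from the earlier working.

```python
import math


def tschuprow_t(c, rows, cols):
    """T = sqrt( C^2 / ((1 - C^2) * sqrt((s - 1)(t - 1))) )."""
    c2 = c * c
    return math.sqrt(c2 / ((1 - c2) * math.sqrt((rows - 1) * (cols - 1))))


t_value = tschuprow_t(0.495, rows=4, cols=4)
print(round(t_value, 3))
```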


Chi-Square (χ²) and degrees of freedom


In χ² studies degrees of freedom are of very great
importance. By degrees of freedom we mean the number of
cells in a contingency table which we can determine ourselves;
the rest are then determined by the row and column totals.
Let us illustrate by means of an example.


            A        a        Total
B           12       18       30
            (AB)     (aB)     (B)
b           13       7        20
            (Ab)     (ab)     (b)
Total       25       25       50
            (A)      (a)      N


We start with the null hypothesis. Let us suppose that
attributes A and B are independent of each other. According
to this supposition the value of

$$(AB) = \frac{(A) \times (B)}{N} = \frac{25 \times 30}{50} = 15$$

Having determined the value of (AB), the other cell frequencies
are determined with reference to the totals as :—


            A        a        Total
B           15       15       30
            (AB)     (aB)     (B)
b           10       10       20
            (Ab)     (ab)     (b)
Total       25       25       50
            (A)      (a)      N


Thus in a 2×2 table we have the option of determining
one frequency; the remaining three are automatically determined.
This is called one degree of freedom. The degrees of freedom
are determined by the following formula :—


$V = (c - 1)(r - 1)$
where V = the degrees of freedom
      c = the number of columns
      r = the number of rows.
Hence in a 2×2 table the degrees of freedom
$V = (c - 1)(r - 1) = (2 - 1)(2 - 1) = 1$
In a 3×3 table $V = (3 - 1)(3 - 1) = 2 \times 2 = 4$
In a 4×4 table $V = (4 - 1)(4 - 1) = 3 \times 3 = 9$
In case data are not given in the form of a contingency
or association table, but in the form of a series of individual
observations or a discrete series, the degrees of freedom are
equal to the number of values less 1.
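The idea that one cell fixes the whole of a 2×2 table can be made concrete. The Python sketch below is illustrative only (the function names are assumptions): it counts degrees of freedom for a table and for a simple series, and fills in a 2×2 table from the single frequency (AB) and the marginal totals used in the example above.

```python
def df_contingency(rows, cols):
    """Degrees of freedom of a contingency table: V = (c - 1)(r - 1)."""
    return (cols - 1) * (rows - 1)


def df_series(number_of_values):
    """Degrees of freedom of a discrete series: number of values less 1."""
    return number_of_values - 1


def complete_2x2(ab, a_total, b_total, n):
    """Fill a 2x2 table from the one free cell (AB) and the marginal totals."""
    cell_Ab = a_total - ab           # (Ab) = (A) - (AB)
    cell_aB = b_total - ab           # (aB) = (B) - (AB)
    cell_ab = n - a_total - cell_aB  # (ab) = N - (A) - (aB)
    return {"AB": ab, "Ab": cell_Ab, "aB": cell_aB, "ab": cell_ab}


print(df_contingency(2, 2), df_contingency(3, 3), df_contingency(4, 4))
print(complete_2x2(15, a_total=25, b_total=30, n=50))
```

Once (AB) is chosen, no further choice remains, which is exactly what one degree of freedom means for a 2×2 table.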


Chi-Square and levels of Significance 

Having calculated the value of χ², as explained above,
we compare it with the table value for the appropriate
number of degrees of freedom at a certain level of significance.
Generally the 0.05 or 5% level of significance is used.
The appropriate value of χ² for a given number of degrees of
freedom and at a particular level of significance is found
by reference to the χ² table given at the end of this book. If
the calculated value of χ² is higher than the value given in
the table, it is considered significant and the hypothesis of
independence is not justified. In other words, there is
association between the attributes. If, however, the calculated
value of χ² is less than the value given in the table, it is
regarded as insignificant, the difference being due to
fluctuations of sampling. The hypothesis is then justified.


In the above example the value of χ² is 325.132 and
the degrees of freedom are
$V = (c - 1)(r - 1) = (4 - 1)(4 - 1) = 3 \times 3 = 9$.
With 9 degrees of freedom the value of χ² at the 5% level of
significance is 16.919. The calculated value is higher than this.
Hence the hypothesis is not justified and there is marked association
between the two attributes.
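The comparison with the table value is a simple rule and may be sketched in Python as below. The small dictionary holds only the standard 5% critical values for the degrees of freedom used in this chapter's illustrations; it stands in for the full χ² table, and the names are illustrative.

```python
# Standard 5% critical values of chi-square for a few degrees of freedom.
CRITICAL_5_PERCENT = {1: 3.841, 6: 12.592, 7: 14.067, 9: 16.919}


def is_significant(chi2, df, table=CRITICAL_5_PERCENT):
    """True when the calculated chi-square exceeds the table value."""
    return chi2 > table[df]


# The 4 x 4 illustration: chi-square 325.132 with 9 degrees of freedom.
print(is_significant(325.132, 9))
```

A result of True means the hypothesis of independence is not justified at the 5% level.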


Illustration—21 

Certain statistics published by Government were analysed
to find the preference of the Government for certain digits. A
sample of 200 digits was taken at random and it was found to
have the following frequencies.


Digits       0    1    2    3    4    5    6    7    8    9
Frequency    18   19   23   21   16   25   22   20   21   15

How far do you agree with the hypothesis that digits are
equally distributed in the statistics published by the Government?
Use the χ² test to arrive at your conclusion.


$V = (c - 1)(r - 1) = (2 - 1)(10 - 1) = 1 \times 9 = 9$


The expected frequency in each case is 200/10 = 20,
assuming equal distribution.

Digits                0    1    2    3    4    5    6    7    8    9
Frequency             18   19   23   21   16   25   22   20   21   15
Expected frequency    20   20   20   20   20   20   20   20   20   20

$$\chi^2 = \frac{(18-20)^2}{20} + \frac{(19-20)^2}{20} + \frac{(23-20)^2}{20} + \frac{(21-20)^2}{20} + \frac{(16-20)^2}{20} + \frac{(25-20)^2}{20} + \frac{(22-20)^2}{20} + \frac{(20-20)^2}{20} + \frac{(21-20)^2}{20} + \frac{(15-20)^2}{20}$$

$$= \frac{4 + 1 + 9 + 1 + 16 + 25 + 4 + 0 + 1 + 25}{20} = \frac{86}{20} = 4.3$$

The 5% value of χ² for 9 degrees of freedom is 16.919.
The calculated value of χ² is less than this figure; hence the
hypothesis is justified.
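The working of this illustration reduces to a few lines of Python. The sketch below assumes the ten observed frequencies 18, 19, 23, 21, 16, 25, 22, 20, 21, 15 (they total 200) and an equal expected frequency for every digit; variable names are illustrative.

```python
observed = [18, 19, 23, 21, 16, 25, 22, 20, 21, 15]

# Under the hypothesis of equal distribution every digit is expected equally often.
expected = sum(observed) / len(observed)

chi2 = sum((f - expected) ** 2 / expected for f in observed)
df = len(observed) - 1

print(expected, round(chi2, 2), df)
```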


Illustration—22 


The following table shows the number of people interviewed
by age-groups and the number in each age group estimated to
have peptic ulcers.


Age group          | 15–20 | 20–25 | 25–35 | 35–45 | 45–55 | 55–65 | 65–75 | Total
Nos. interviewed   |   199 |   300 |  1128 |  1375 |  1089 |   625 |   155 |  4871
Nos. with ulcers   |     1 |     8 |    38 |    96 |   105 |    56 |    12 |   316


Do these figures justify the hypothesis that peptic ulcer
is equally prevalent in all age groups ?


If peptic ulcer were equally prevalent in all age groups, then
in each age group (316/4871) × 100, or 6.5%, of the people should
suffer from it. On this basis the observed and expected
frequencies would be as follows :—


Age group        | 15–20 | 20–25 | 25–35 | 35–45 | 45–55 | 55–65 | 65–75
Observed cases   |     1 |     8 |    38 |    96 |   105 |    56 |    12
Expected cases   |    13 |  19.5 |    73 |    89 |    71 |  40.5 |    10

$$\chi^2 = \frac{(1-13)^2}{13} + \frac{(8-19.5)^2}{19.5} + \frac{(38-73)^2}{73} + \frac{(96-89)^2}{89} + \frac{(105-71)^2}{71} + \frac{(56-40.5)^2}{40.5} + \frac{(12-10)^2}{10} = 57.80$$

There are six degrees of freedom in the question. For six
degrees of freedom the table value of χ² at the 5% level of
significance is 12.59. The calculated value is much higher than
this and as such the difference is significant and the hypothesis
is not justified.
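The calculation for this illustration can be sketched in Python as follows. The observed and expected cases are those of the solution table; the expected values are the text's rounded figures (6.5% of each group), so this is a check on the arithmetic under that assumption rather than an independent derivation.

```python
interviewed = [199, 300, 1128, 1375, 1089, 625, 155]
observed = [1, 8, 38, 96, 105, 56, 12]
expected = [13, 19.5, 73, 89, 71, 40.5, 10]  # rounded values used in the text

# Overall proportion of people with ulcers, which the text rounds to 6.5%.
rate = sum(observed) / sum(interviewed)

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

print(round(100 * rate, 1), round(chi2, 2), df)
```

The two largest contributions come from the 25–35 and 45–55 groups, where the observed cases fall furthest from expectation.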


Illustration—23 


In an experiment 9 coins were tossed 512 times and the
number of heads was observed as follows :—
No. of heads    0    1    2    3     4     5     6     7    8    9
Frequencies     1    2    1    50    154   110   100   92   1    1
Do the frequencies given above confirm the hypothesis that
the coin is unbiased ?


Supposing that the coin is unbiased and the distribution
of frequencies is normal, the binomial expansion of $512\left(\tfrac{1}{2} + \tfrac{1}{2}\right)^9$
will give the expected frequencies as follows :—

No. of heads    0    1    2     3     4      5      6     7     8    9

Expected f      1    9    36    84    126    126    84    36    9    1

The frequencies of 0 and 9 being less than 5, the 0 and 1
classes, and the 8 and 9 classes, have to be merged. The observed
and the theoretical frequencies will, then, be as follows :—


Heads            0 & 1    2     3     4      5      6     7     8 & 9
Observed f       3        1     50    154    110    100   92    2
Theoretical f    10       36    84    126    126    84    36    10

$$\chi^2 = \sum \frac{(f - f_1)^2}{f_1}$$

Substituting the values, we get:

$$\chi^2 = \frac{(3-10)^2}{10} + \frac{(1-36)^2}{36} + \frac{(50-84)^2}{84} + \frac{(154-126)^2}{126} + \frac{(110-126)^2}{126} + \frac{(100-84)^2}{84} + \frac{(92-36)^2}{36} + \frac{(2-10)^2}{10}$$

$$= \frac{49}{10} + \frac{1225}{36} + \frac{1156}{84} + \frac{784}{126} + \frac{256}{126} + \frac{256}{84} + \frac{3136}{36} + \frac{64}{10}$$

$$= 4.9 + 34.03 + 13.76 + 6.22 + 2.03 + 3.05 + 87.11 + 6.4 = 157.50$$

Degrees of freedom = (8 − 1)(2 − 1) = 7.


We enter the χ² table with 7 degrees of freedom. The 5%
value of χ² with 7 degrees of freedom = 14.07. The calculated
value, viz. 157.50, is much more and it can be said that the
coin is biased.
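The expected frequencies and the merging of the end classes can be reproduced with a short Python sketch. It assumes the observed frequencies 1, 2, 1, 50, 154, 110, 100, 92, 1, 1 (which total 512) and uses `math.comb` for the binomial coefficients; the helper name `merge_ends` is illustrative.

```python
from math import comb

N_COINS, N_TOSSES = 9, 512
observed = [1, 2, 1, 50, 154, 110, 100, 92, 1, 1]

# Terms of 512 * (1/2 + 1/2)^9: expected number of tosses showing k heads.
expected = [N_TOSSES * comb(N_COINS, k) / 2 ** N_COINS for k in range(N_COINS + 1)]


def merge_ends(freqs):
    """Merge the first two classes and the last two classes."""
    return [freqs[0] + freqs[1]] + freqs[2:-2] + [freqs[-2] + freqs[-1]]


obs_merged = merge_ends(observed)
exp_merged = merge_ends(expected)

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_merged, exp_merged))
df = len(obs_merged) - 1

print(exp_merged, round(chi2, 2), df)
```

Most of the total comes from the classes of 2 heads and 7 heads, where the observed counts (1 and 92) lie far from the expected 36.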


Illustration 24 


Two sample polls of votes for two candidates A and B
for a public office are taken, one from among residents of urban
areas, and the other from residents of rural areas. The results
are given below. Examine whether the nature of the area is
related to voting preference in this election.


                 Votes for
Area             A         B         Total
Rural            620       380       1,000
Urban            550       450       1,000
Total            1,170     830       2,000


(B. Com. Delhi)

Setting up the hypothesis that the nature of area is
independent of voting preference, the expected frequencies would
be :—

                 A                          B                         Total
Rural            1000 × 1170 / 2000 = 585   1000 × 830 / 2000 = 415   1,000
Urban            1000 × 1170 / 2000 = 585   1000 × 830 / 2000 = 415   1,000
Total            1,170                      830                       2,000

$$\chi^2 = \frac{(620-585)^2}{585} + \frac{(380-415)^2}{415} + \frac{(550-585)^2}{585} + \frac{(450-415)^2}{415}$$

$$= (35)^2\left[\frac{1}{585} + \frac{1}{415} + \frac{1}{585} + \frac{1}{415}\right] = 10 \text{ approx.}$$

The number of degrees of freedom
= (2 − 1)(2 − 1) = 1.

The calculated value of χ², which is 10, is much greater than
the table value of χ² for 1 d.f. at the 5% level of significance
(3.841). We, therefore, conclude that our hypothesis is wrong and
the nature of area is related to voting preference.
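The same test can be run on the 2×2 voting table in Python. This is a sketch with illustrative names; it derives the expected frequencies from the marginal totals in the usual way and returns them along with χ².

```python
def chi_square_table(table):
    """Return (chi-square, expected frequencies) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (f - e) ** 2 / e
        for row, exp_row in zip(table, expected)
        for f, e in zip(row, exp_row)
    )
    return chi2, expected


votes = [
    [620, 380],  # Rural: votes for A, votes for B
    [550, 450],  # Urban: votes for A, votes for B
]

chi2, expected = chi_square_table(votes)
print(expected, round(chi2, 2))
```

Carried to two decimals the value is slightly above 10, which the text rounds to "10 approx."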

Theoretical Questions

1—How would you distinguish between ‘association’ and
‘correlation’ as the terms are used in Statistics ?

(M. A. Agra, M. Com. Alld.)
2—Write a note on the use of coefficient of Association in
analysing economic statistics. (M. Com. Agra)


3—What do you understand by ‘Association of Attributes’ ?
How is its existence or non-existence determined ? What is
partial association ?
Practical Questions 


1—Out of 900 persons, 300 were literate and 400 had
travelled beyond the limits of their district. 200 of the literates
were among those who had travelled. Is there any relation
between travelling and literacy ?

(Ans. Q=+0.6) (M. Com. Agra)
2—Calculate the Coefficient of Association between
intelligence in father and son from the following data :—


Intelligent fathers with intelligent sons 248
Intelligent fathers with dull sons 81
Dull fathers with intelligent sons 92
Dull fathers with dull sons 579
(Ans. Q=+0.98) (M. A. Alld.)


3—Can vaccination be regarded as a preventive measure of
small-pox from the data given below ?


"Of 1482 persons in a locality exposed to small-pox, 368 in
all were attacked. Of 1482 persons 343 had been vaccinated and
of these only 35 were attacked". (M. Com. Alld.)


(Ans. Q=+0.57)
4—Calculate the coefficient of Association between extravagance
in fathers and sons from the following data :—


Extravagant fathers with extravagant sons 327
Extravagant fathers with miserly sons 545
Miserly fathers with extravagant sons 741
Miserly fathers with miserly sons 235
(Ans. Q=−0.68) (M. A. Luck., Raj., Agra)


5. Find out the coefficient of Association between the type 
of training and success in teaching from the following table :— 


Institution :— 
Successful Unsuccessful Total 
Teachers’ College 58 42 100 
University 49 51 100 
Total 107 93 200 
(Ans. Q=+0.18) (M. A. Raj., Alld.)


6—What do you understand by ‘Contingency’? In an
investigation into the Health and Nutrition of certain children
(between the ages of one and five years) two groups of children
were compared, one belonging to the well-to-do class, 125 in
number, and the other belonging to the poor class, 124 in number.
The following results were obtained :—


                         Poor children    Well-to-do children
                         (per cent)       (per cent)
Below normal weight      75               23
Above normal weight      5                42


Find the coefficient of association between the weight of the
children and their parents' financial condition.


(Ans. Q=+0.929) (M. Com. Agra, Raj.)
7—The following table shows the distribution of the temper
in pairs of sisters in an exhaustive school inquiry :—


                       FIRST SISTER
SECOND SISTER     Good-natured    Sullen    TOTAL

Good-natured      1,040           180       1,220
Sullen            160             120       280
Total             1,200           300       1,500


Trace the association, if any, in the distribution of tempers
in first and second sisters. (M. Com. Raj.)
(Ans. Q=+0.72)


8—A census revealed the following figures of the blind and
the insane in two age-groups in a certain population :—


                                  Age-group       Age-group
                                  15–25 years     Over 75 years

Total population                  2,70,000        1,60,200
Number of blind                   1,000           2,000
Number of insane                  6,000           1,000
No. of insane among the blind     19              9


Obtain a measure of association between blindness and
insanity in each of the two age-groups. (M. Com. Alld.)

(Ans. Q1=−0.07 and Q2=−0.16. There is a greater degree
of disassociation in the ‘over 75 years’ age group.)

9—An investigation was carried out to determine whether
there is any association between the eye-colour of parents and
the eye-colour of children. The colours were noted in the case
of a random sample of 1000 fathers and their eldest sons. In
471 cases both fathers and sons had light eyes, in 230 cases both
had dark eyes, in 148 cases the fathers were dark-eyed and the
sons light-eyed, and in all remaining cases the sons were dark-eyed
and the fathers were light-eyed.

Determine whether eye-colour in fathers and in sons is
associated or independent. (M. A. Agra)

(Ans. Q=+0.65)

10—In the study of aggregates having different attributes how 
would you determine whether the attributes are mutually 
independent or are associated in some way ? 

1660 candidates appeared for a competitive examination, 422
were successful. 256 had attended a coaching class and of these
150 came out successful. Estimate the utility of the coaching
class. (M. Com. Agra)


(Ans. Q=+0.7)
11—(a) Write a short note on the use of coefficient of Association
in analysing economic statistics.


(b) From the figures given in the following table,
compare the association between literacy and unemployment
in the rural and urban areas, and give
reasons for the difference, if any :—


                                   Urban        Rural
Total Adult Males                  25 lakhs     200 lakhs
Literate Males                     10 lakhs     40 lakhs
Unemployed Males                   5 lakhs      12 lakhs
Literate and Unemployed Males      3 lakhs      4 lakhs


(M. A. Eco. St. Delhi ; M. Com. Agra)
(Ans. Q for Urban=+0.47, Q for Rural=+0.356)
12—The following are the numbers of boys observed with
certain classes of defects amongst a number of school children.
A denotes development defects, B nerve signs, C low nutrition.


(ABC) =  149      (aBC) =   204
(ABc) =  738      (aBc) =  1762
(AbC) =  225      (abC) =   171
(Abc) = 1196      (abc) = 21842


Find the frequencies of the positive classes.


(Ans. N=26287, (A)=2308, (B)=2853, (C)=749, (AB)=887,
(BC)=353, (AC)=374)


13—Given the following positive class frequencies find all the
ultimate class frequencies.


N   = 23713      (AB)  = 587
(A) =  1618      (BC)  = 428
(B) =  2015      (AC)  = 335
(C) =   770      (ABC) = 156


(Ans. (ABc)=431, (AbC)=179, (Abc)=852, (aBC)=272,
(aBc)=1156, (abC)=163, (abc)=20504)


14—At an examination at which 600 candidates appeared,
boys outnumbered girls by 16 per cent. Also those passing the
examination exceeded in number those failing by 310. The number
of successful boys choosing Science subjects was 300 while among
the girls offering Arts subjects there were 25 failures. Altogether
only 135 offered Arts and 33 among them failed. Boys failing
in the examination numbered 18. Obtain all the class frequencies.

(Ans. (A)=348, (a)=252, (B)=455, (b)=145, (C)=465,
(c)=135, (AB)=330, (AC)=310, (BC)=353, (Ab)=18,
(Ac)=38, (Bc)=102, (aB)=125, (aC)=155, (bC)=112, (ab)=127,
(ac)=97, (bc)=33, (ABC)=300, (AbC)=10, (aBC)=53,
(abC)=102, (ABc)=30, (aBc)=72, (Abc)=8, (abc)=25,
N=600)


15—From the following, find whether blindness and baldness
are associated :—


Total population —1,62,64,000 
Number of bald headed = 24,441 
Number of blind = 7,628 
Number of bald headed blind =з 291 


(M. Sc. Agra)


(Ans. There is positive association because expected fre- 
quencies of (AB) are less than the actual) 


16—The following table gives the numbers of literates and 
criminals in three cities of U.P. :— 


Kanpur Allahabad Agra 


Total Number (in thousands) 244 184 230 
Literates (in thousands) 40 47 38 
Literate Criminals (in thousands) 3 2 2 
Illiterate Criminals (in thousands) 40 20 24 


Compare the degree of association between criminality and
illiteracy in each of the three towns. (M. A. Alld.)


(Ans. Q's: Kanpur=+.45, Alld.=+.55 and Agra=+.34)

17—The male population of U.P. is 250 lakhs. The
number of literate males is 20 lakhs, and the total number of male
criminals is 26 thousand. The number of literate male criminals
is 2 thousand. Do you find any association between literacy and
criminality ? (M. A. Agra)
(Ans. Q=−.02, negative association between literacy and
criminality)


18—Show how to form conditions of consistence of statistical
returns for three attributes.


The following are the proportions per 10,000 of boys observed
with certain classes of defects among a number of school children.


A=Development defects
B=Nerve signs
C=Mental dullness.


N  =10000      (C)=789
(A)=  877      (AB)=338
(B)= 1086      (BC)=455


Show that some dull boys do not exhibit development defects
and find how many at least do not do so. Taking the smallest
number of such boys find the ultimate class frequencies.

(B. Com. Madras)

(Ans. At least 117 dull boys do not exhibit development
defects. The data are incomplete for finding the ultimate class
frequencies)




19—In a certain investigation carried out with regard to 500
graduates and 1500 non-graduates, it was found that the number
of employed graduates was 450, while the number of unemployed
non-graduates was 300. In the second investigation 5000 cases
were examined. The number of non-graduates was 3000 and the
number of employed non-graduates was 2500. The number of
graduates who were found to be employed was 1600.


Calculate the coefficient of association between graduation
and employment in both the investigations.


Can any definite conclusion be drawn from the coefficients ?
(M. A. Agra)


(Ans. Q (I Investigation)=+.38, Q (II Investigation)=−.1)


20—The following summary appears in a report on a survey
covering 1,000 fields. Find out if the data are consistent.


Manured fields                                          .. 510
Irrigated fields                                        .. 490
Fields growing improved varieties                       .. 427
Fields both irrigated and manured                       .. 139
Fields both manured and growing improved varieties      .. 140
Fields both irrigated and growing improved varieties    .. 85


(M. Com., Allahabad and I.A.S.)
(Ans. There is no inconsistency)


21—The following are the proportions per 5,000 of workers
observed for certain classes of defects amongst a number of
factory workers :—


A=Development defect
B=Nerve signs
C=Mental dullness


N  =5,000      (C)=400
(A)=  440      (AB)=170
(B)=  545      (BC)=228


Show that some dull workers do not exhibit development
defects and state how many at least do not do so.
(M. Com., Allahabad)


(Ans. At least 58 workers do not exhibit development defects)

22—Among the adult population of a certain town 50 per
cent. of the population are males, 60 per cent. wage earners, and
50 per cent. are 50 years of age or over. 10 per cent. of the males
are not wage earners and 40 per cent. of the males are under 50.
Can we infer anything about what percentage of the population
of 50 years or over are wage earners ? (M. Com., Allahabad)


(Ans. The percentage of wage earning population of 50
years or over must lie between 25 and 45)
23—The table given below shows the data obtained during
an epidemic of cholera :—


                   Attacked    Not Attacked    Total

Inoculated         31          469             500
Not Inoculated     185         1,315           1,500
Total              216         1,784           2,000


Test the effectiveness of inoculation in preventing the attack
of cholera.


[Five per cent value of χ² for one degree of freedom is
3.84]. (I. A. & A. S.)


(Ans. χ²=14.64; hypothesis not justified)


24—The Fortune magazine of U. S. A. published the following
results of a sample survey of public opinion regarding the
election of Roosevelt as the President of U. S. A. :—


Attitude towards election    Rich    Poor    Total
Favourable                   508     1559    2067
Unfavourable                 905     1114    2019
Total                        1413    2673    4086


Is attitude towards the election issue guided by the economic
status of the voters ? What test would you apply ? The following
table of 1% values of Chi-square is reproduced for your use :—


Degrees of freedom :         1        2        3         4
1% value of Chi-Square :     6.635    9.210    11.341    13.277
(M. A. Patna)


(Ans. χ²=187.1, Yes)


25—In the course of anti-malarial work in Birnagore, in
the third quarter of 1932, quinine was administered to 606 adults
out of a total population of 3,540. The incidence of malarial
fever is shown below. Discuss the preventive value of quinine.


               Fever    No fever    Total
Quinine        19       587         606
No quinine     193      2,741       2,934
Total          212      3,328       3,540


You may use the 5% value of chi-square for m degrees of
freedom, equal to 1, the value being 3.841. (M.A., Calcutta)
(Ans. χ²=10.59; not independent)
26—The following table gives the results of a series of
controlled experiments. Discuss whether the treatment may be
considered to have any positive effect :—


              Positive    No effect    Negative

Treatment     9           2            1
Control       3           6            3
Total         12          8            4 = 24
(I.A. & A.S.)


(Ans. χ²=9.367, associated)


27—There were 200 students in a college, whose results
in the 1st Terminal, 2nd Terminal and the Annual Examination
were as follows :—


80 passed the 1st Terminal examination
75 passed the 2nd Terminal examination
96 passed the Annual examination
25 passed all the three, 46 failed in all the three
29 passed the first two, and failed in the Annual examination.
42 failed in the first two but passed the Annual examination.
Find how many students passed at least two examinations ?


(M. Com. Agra)
(Ans. =83)


28—750 candidates appeared and 470 passed at an examina- 
tion, 465 had attended classes and 58 of them failed. Prove the 
utility of the classes. (M. Com. Agra) 

29—If (A)=50, (B)=60, (C)=50, (Ab)=5, (Ac)=20,
and N=100, find the greatest and least possible values of (BC).

(M. Sc. Agra)


30—(a) Given the following frequencies of the positive
classes find the frequencies of the ultimate classes :


(A)=40, (B)=60, (AB)=30, N=130.


(b) Examine whether A and B are independent in the
following case :


(A)=490, (AB)=294, (a)=560, (aB)=380.


E 


(B. Sc. Agra) 
(Ans. (a)—(Ab)—80, (ab)—60, (AB)=30. 

(b)—A and B are dependent) 
31—Find the value of chi-square for the following table.


Class                    A    B     C     D     E
Observed Frequency       8    29    44    15    4
Theoretical Frequency    7    24    38    24    7

(M. Sc. Agra)


(Ans. χ²=6.75)




32—In the contingency table given below, use the χ² test to
test for independence of hair-colour and eye-colour of persons.


[Contingency table of hair colour against eye colour (Blue, Brown) with totals; the cell frequencies are not legible.]


(Ans. The calculated value of χ² is 15, which is much
greater than the table value, which is 3.84, so that our hypothesis
is wrong. We therefore conclude that hair colour and eye colour
are associated)


CHAPTER 18


ANALYSIS OF TIME SERIES 


“The analysis of time series has developed in the main as a
result of investigations into the nature and causes of those
fluctuations in economic activity called trade cycles. Economic
theory has suggested various explanations of trade cycles. Analysis
of time series has attempted to test the plausibility or otherwise
of these theories. At the same time, such analysis may suggest
new hypotheses for economic theorists to work on.”


P. H. KARMEL


A series of observations recorded over time is called a time
series. Numerical data which have been recorded at intervals
of time form a time series. Most economic data are recorded
over time, e.g. annual production, national income etc.
Economic investigations are very largely dependent on the use
of data arranged in time series. Suppose, for example, we
wish to investigate the relation between the price of tea and
the quantity demanded. Ideally, we should take an economy, set
the price of tea at 1 Rupee per lb., observe the quantity
demanded, vary the price, observe again and so on. This is
the experimental method available to physical scientists working
in the laboratory. Unfortunately the economist can seldom
utilise this method. His data at best consist of actually recorded
prices and quantities demanded at various times, i.e. a time
series of prices and a parallel one of quantities purchased. The
demand curve for tea, if it is to be estimated, must somehow
be drawn out of these data. A time series may be defined as
“a sequence of values of some variable according to successive
points in time.”

The primary purpose in the statistical analysis of a time
series is to discover and measure any regularities which
characterise the movement of the data through time. In other
words, analysis of a time series consists of discovering, measuring
and isolating any regular or persistent movements present in
the series. The main objective in analysing time series is to
understand, interpret and evaluate changes in economic phenomena
in the hope of more correctly anticipating the course of
future events. By studying the past pattern of changes, the
future course of events is forecast.


In time series, the following characteristics would be
observed :—


I  Long Period Movements
   Secular Trend or ‘Trend’
II Short Period Movements
   A—Regular or Periodic movements
      (1) Seasonal changes
      (2) Cyclical changes
   B—Irregular or Erratic or Spasmodic movements.


A time series may not be affected by all these types of
movements. Some of these types of movements may affect a
few time series, while some other series may be affected by
all of them. Therefore in analysing time series the effects of the
various types of movements on a series are isolated.


LONG PERIOD MOVEMENTS 


Secular Trend. The term ‘secular trend’ is taken to mean
the general long-term movement of the series. According
to Werner Z. Hirsch, “By trend, sometimes also called secular
trend, we mean the long run gradual growth or decline in a
series which is an expression of such fundamental forces as
population growth, improvements in know-how and productivity,
increases in the supply of capital equipment and changes in the
consumption habits.” Some phenomena have an upward and
others a downward trend. For instance, in most countries there
exists a pronounced downward trend in the length of the work
week and an upward trend in population and in agricultural and
industrial production. Not all secular changes take place at a
constant pace. Despite temporary deviations from the course,
both large and small fluctuations, there will be a clearly marked
tendency in a given direction.


Trends may be computed to describe the secular movement
of (1) an industry, (2) a single enterprise and (3) the economy
as a whole. Each of these presents a separate and unique
problem of statistical and theoretical analysis.


The most popular way of showing trend is in the form of a
graph. Secular trend can be either linear or non-linear.
A linear trend is one which gives a straight line when plotted on
graph paper. The straight line given by such a trend can be
either (1) an Arithmetic Straight line, where the average amount
of growth or decline is constant, or (2) a Geometric Straight line,
where the increase or decrease is by a uniform proportionate
rate. On ordinary graph paper, data in geometric progression
will give a curve, but when plotted on semi-logarithmic
paper they will form a straight line. Growth curves of most
economic phenomena take the form of the Logistic Curve, often
called the ‘S’ curve. When the trend of the whole economy is found
out, the curve will be S-shaped.


NON-LINEAR TREND

In the analysis of time series we use methods developed by a
curious combination of mathematics and economic analysis. The
following are the principal methods of estimating the secular
movements :—

(1) Freehand Method
(2) Method of Averages
    (i) Selected Points
    (ii) Semi-Average
    (iii) Moving Averages.
(3) Method of Least Squares
    (i) Arithmetic Straight line
    (ii) Logarithmic Straight line
    (iii) Parabolic Curve.


Freehand Method. When the freehand method of estimating
the secular movement in a time series is used, the original
data are first plotted on graph paper; then a smooth line is
drawn through the plotted points which, in the statistician’s
judgement, accurately describes the secular movement. As an
aid in establishing the line, flexible rulers are sometimes used,
or a piece of string is laid on the flat surface of the chart and
adjusted as the characteristics of the line are decided upon.
It is wrong to think that this is an easy and quick method of
drawing a curve. On the other hand, it is a time-consuming method
and requires extreme care and conscientiousness.

The main objection to this method is that it is necessarily
rough, and the result will depend very much on the judgement of
the drawer of the line. No two statisticians will draw the
same trend. Further, there is no mathematical expression for
such a line, so its properties cannot be described in the
abbreviated language of mathematics. However, a freehand
trend drawn by a statistician with long experience in computing
trends, who is also acquainted with the economic history and
analysis, may give a better expression of the secular movements
than a trend fitted by other methods.


Method of Averages 

(1) Selected Points—According to this method two points
considered to be most representative or normal are determined
and joined. This is not a satisfactory method because
opinions may differ in the selection of points. What is
considered by one as a ‘normal year’ may be taken as abnormal
by another.

(2) Semi-average Method—The semi-average method is
employed when a straight line appears to be an adequate
expression of trend. The data of the time series are divided
into two equal parts and one summation is made of the values
for the first half of the series and another for the second half.
If there is an odd number of years the value of the middle year
may be omitted. The sums are then divided by the number of
items included to obtain two arithmetic averages. Each average
is then centred in the period of time from which it has been
computed and plotted on the graph. A straight line may then
be passed through the two points so located. This line constitutes
the semi-average trend line.

It is also possible to divide the series into more than two
parts, compute an average for each part and, after locating the
points in the middle of their time periods, connect them with a
series of straight lines.


Year    Production in millions

1943    12
1944    13
1945    16      Total = 90,  Average = 15
1946    14
1947    16
1948    19

1949    16      (middle year, omitted)

1950    19
1951    22
1952    19      Total = 132, Average = 22
1953    22
1954    25
1955    25
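The semi-average calculation for this table can be sketched in Python as below. The sketch is illustrative only: the function name is an assumption, and the 1952 figure is taken as 19, the value implied by the stated second-half total of 132. Each average is centred on the middle of its own period, and the slope of the line joining the two points gives the yearly increase of the trend.

```python
YEARS = list(range(1943, 1956))
# Production in millions; 1952 is taken as 19 (implied by the total of 132).
PRODUCTION = [12, 13, 16, 14, 16, 19, 16, 19, 22, 19, 22, 25, 25]


def semi_average_trend(years, values):
    """Return the two semi-average points and the slope of the line joining them."""
    half = len(values) // 2  # with an odd count the middle year is left out
    first_avg = sum(values[:half]) / half
    second_avg = sum(values[-half:]) / half
    first_mid = sum(years[:half]) / half      # centre of the first period
    second_mid = sum(years[-half:]) / half    # centre of the second period
    slope = (second_avg - first_avg) / (second_mid - first_mid)
    return (first_mid, first_avg), (second_mid, second_avg), slope


point_1, point_2, slope = semi_average_trend(YEARS, PRODUCTION)
print(point_1, point_2, slope)
```

With six years in each half the averages fall midway between the third and fourth years of each period, so the trend line rises by one unit of production per year.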


[Graph: Production in millions, 1943 to 1955, with the semi-average trend line]

Obviously this method is simple, but it has certain
limitations also. These are: (1) The method of semi-averages
assumes a straight line relationship between the plotted points.
(2) There is no assurance that the influence of the cycle is
eliminated. This danger is greater when the time period represented
by an average is small. (3) This method has the defects which
are found in the arithmetic average. (4) If arithmetic averages
of the data are to be used in estimating the secular movement,
it is sometimes better to use moving averages than semi-averages.


Moving Averages Method. Like the semi-average method,
this method also employs an arithmetic mean of items. But
in this case there are as many averages as there are items in
the series, except at the limits of the series where the averages
cannot be computed. Moving averages are commonly calculated
for periods of 3, 4, 5, 7 or 9 years. Averages are taken from
overlapping periods. This technique of using overlapping periods
simplifies the analysis by removing variations of a periodic type.
If there is regularity in the occurrence of business cycles, the
moving average trend will be a straight line; otherwise it will be
a curve.


Illustration—1 


Calculate the five yearly moving average for the following 
time series and plot it with the original figures on the same 
graph. Next calculate the seven yearly moving average and plot 
it on the same graph. Comment on the reversal effect. 


Year  Annual figure | Year  Annual figure | Year  Annual figure 
  1       110       |  11       130       |  21       146 
  2       104       |  12       127       |  22       142 
  3        98       |  13       122       |  23       138 
  4       105       |  14       118       |  24       135 
  5       109       |  15       130       |  25       145 
  6       120       |  16       140       |  26       155 
  7       115       |  17       135       |  27       150 
  8       110       |  18       130       |  28       148 
  9       114       |  19       127       |  29       143 
 10       122       |  20       135       |  30       156 


(M. Com. B. H. U.) 


Year | Annual figure | 5 yearly | 5 yearly | 7 yearly | 7 yearly 
     |               |  Total   |   M.A.   |  Total   |   M.A. 

  1        110 
  2        104 
  3         98          526        105 
  4        105          536        107        761        109 
  5        109          547        109        761        109 
  6        120          559        112        771        110 
  7        115          568        114        795        114 
  8        110          581        116        820        117 
  9        114          591        118        838        120 
 10        122          603        121        840        120 
 11        130          615        123        843        120 
 12        127          619        124        863        123 
 13        122          627        125        889        127 
 14        118          637        127        902        129 
 15        130          645        129        902        129 
 16        140          653        131        902        129 
 17        135          662        132        915        131 
 18        130          667        133        943        135 
 19        127          673        135        955        136 
 20        135          680        136        953        136 
 21        146          688        138        953        136 
 22        142          696        139        968        138 
 23        138          706        141        996        142 
 24        135          715        143       1011        144 
 25        145          723        145       1013        145 
 26        155          733        147       1014        145 
 27        150          741        148       1032        147 
 28        148          752        150 
 29        143 
 30        156 


(In calculating the M.A., figures have been rounded to the 
nearest whole number.) 
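
The totals and averages in the table can be checked mechanically. The following is a minimal sketch, assuming the thirty annual figures of Illustration 1; the helper name `moving_totals` is hypothetical. Each total is centred on the middle year of its period, so the first five-yearly average belongs to year 3 and the first seven-yearly average to year 4.

```python
def moving_totals(values, period):
    """Sums of every run of `period` consecutive items (overlapping periods)."""
    return [sum(values[i:i + period]) for i in range(len(values) - period + 1)]


figures = [110, 104, 98, 105, 109, 120, 115, 110, 114, 122,
           130, 127, 122, 118, 130, 140, 135, 130, 127, 135,
           146, 142, 138, 135, 145, 155, 150, 148, 143, 156]

five_totals = moving_totals(figures, 5)
seven_totals = moving_totals(figures, 7)

# Moving averages rounded to whole numbers, as in the table.
five_ma = [round(t / 5) for t in five_totals]
seven_ma = [round(t / 7) for t in seven_totals]

print(five_totals[:3], five_ma[:3])
print(seven_totals[:3], seven_ma[:3])
```

No averages can be formed for the first and last two years (five-yearly) or the first and last three years (seven-yearly), which is why the lists are shorter than the series.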

The technique of moving average is used to eliminate the 
fluctuations and give only the general trend of the series. If a 
periodicity is noticed in the occurrence of cycles in the series, 
the moving average should cover one cycle period. This will 
eliminate nearly all regular and irregular fluctuations. 
Periodicity means the average duration of a cycle. 


The following principles should be noted in this 
connection : 
(i) If the original data when plotted on a graph give a 
straight line, the moving average will simply 
reproduce the original line. 

(ii) If the original series gives a curve which is concave, 
the moving average curve will be below it. 

(iii) If the original series gives a convex curve, the moving 
average curve will be above it. 

(iv) In a series having regular fluctuations the moving 
average completely eliminates them, if the period 
selected for it coincides with the period in which the 
fluctuations repeat themselves. 

(v) Erratic movements are never completely eliminated. 
They can be reduced, and the greater the number of 
items in the average, the greater is the reduction in the 
fluctuations. 


[Graph: original data, 5-yearly moving average and 7-yearly 
moving average plotted against years 1 to 30.] 


The technique of moving average has certain limitations. 
According to F. R. Macaulay, "The moving average values will not 
follow data which describe a curve unless elaborate weighting 
schemes are used." Another defect is that a moving average 
cannot be brought up to date nor extended back to the first years. 
Further, the conditions necessary for its use are seldom met. It 
is sensitive to freakish movements in the data. It also does not 
result in a mathematical equation which may be used in 
forecasting. 


Notwithstanding these limitations, this technique is also 
becoming popular in the analysis of seasonal variations. 


Least Squares Method. 
(1) Arithmetic Straight Line 


The 'Least Squares' method is given this name because its 
method of calculation gives a certain important mathematical 
property. A trend line computed by the method of least squares 
is such that the sum of the squares of the deviations of the 
observed values about it is a minimum. This is also called the 
line of 'best fit'. Time series data are generally inter- 
dependent; on the basis of this interdependence a trend is 
computed. 


The equation for the straight line may be written as 
follows : 


$y = a + bx$ 

where y = the ordinates of trend 
x = unit of time 

a and b are constants 


The values of the constants a and b are determined from two 
normal equations. The two equations are : 


$\sum y = Na + b\sum x$ 
$\sum xy = a\sum x + b\sum x^2$ 
Illustration—2 


Compute the trend for the following data using the method 
of least squares :— 


1955     83 
1956     92 
1957     71 
1958     90 
1959    169 


Year  |  x  |  x²  |   y   |   xy   |  y' (trend) 
1955  |  1  |   1  |   83  |    83  |    67 
1956  |  2  |   4  |   92  |   184  |    84 
1957  |  3  |   9  |   71  |   213  |   101 
1958  |  4  |  16  |   90  |   360  |   118 
1959  |  5  |  25  |  169  |   845  |   135 
Total | 15  |  55  |  505  |  1685  |   505 

$y = a + bx$ 

$\sum y = Na + b\sum x$ 
$\sum xy = a\sum x + b\sum x^2$ 

By substituting values we get 

 505 =  5a + 15b      (1) 
1685 = 15a + 55b      (2) 
1515 = 15a + 45b      (1) x 3 
Subtracting, 170 = 10b 
so that b = 17 

By substituting the value of b in (1) 
5a + 15b = 505 
5a + 255 = 505 
5a = 505 - 255 = 250 
a = 50 


Now the equation becomes 

y = a + bx  or  y = 50 + 17x 

Value of y when x = 1 is  67 
  "     "    "   x = 2 is  84 
  "     "    "   x = 3 is 101 
  "     "    "   x = 4 is 118 
  "     "    "   x = 5 is 135 

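Values of a and b can also be calculated by the following 
formulae. 


$b = \frac{\sum xy - \bar{x}\sum y}{\sum x^2 - \bar{x}\sum x} = \frac{1685 - 3 \times 505}{55 - 3 \times 15} = \frac{170}{10} = 17$ 

$a = \bar{y} - b\bar{x} = 101 - 17 \times 3 = 101 - 51 = 50$ 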
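
The solution of the two normal equations can also be sketched in code. This is an illustrative sketch only; the function name `fit_line` is hypothetical, and the data are those of Illustration 2 with x numbered 1 to 5.

```python
def fit_line(xs, ys):
    """Solve the two normal equations of y = a + bx for a and b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Eliminating a between the normal equations gives b; a then follows.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


xs = [1, 2, 3, 4, 5]
ys = [83, 92, 71, 90, 169]

a, b = fit_line(xs, ys)
trend = [a + b * x for x in xs]
print(a, b, trend)
```

The trend values sum to the same total as the original figures (505), which is a convenient check on the arithmetic.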
Considerable calculation work will be saved if the trend is 
fitted by the following method, which is based on simple 
calculations. Under this method : 

(1) Find the simple arithmetic mean of the items. 

(2) Calculate the deviation of each year from the middle 
year of the series (x). 

(3) Square these deviations. 

(4) The values of a and b will be calculated as 

$a = \frac{\sum y}{N}$ 

$b = \frac{\sum xy}{\sum x^2}$   (Rate of growth.) 

(5) Place the value of 'a' at the mid point of the trend. By 
multiplying the respective deviations by 'b' and adding the 
products to 'a', the trend values of the other items will be 
found. 


Year  |  x  |  x²  |   y   |   xy   | Trend | Computation 
1955  | -2  |   4  |   83  |  -166  |   67  | 101 + 17 x (-2) 
1956  | -1  |   1  |   92  |  - 92  |   84  | 101 + 17 x (-1) 
1957  |  0  |   0  |   71  |     0  |  101  | Value of a 
1958  | +1  |   1  |   90  |    90  |  118  | 101 + 17 x (+1) 
1959  | +2  |   4  |  169  |   338  |  135  | 101 + 17 x (+2) 
Total |  0  |  10  |  505  |   170  |  505  | 

$a = \frac{\sum y}{N} = \frac{505}{5} = 101$ 

$b = \frac{\sum xy}{\sum x^2} = \frac{170}{10} = 17$ 

Difficulty may be encountered when the series consists of an 
even number of years. In that case the mid-point of the series 
falls between two years, and the two central years lie .5 on 
either side of it. This may be illustrated by the following 
example. 


Year      Trend 

1955       59.5 
1956       82.1 
1957      104.7 
1958      127.3 
1959      149.9 
1960      172.5 
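
The deviations from the middle of the series, for an odd or an even number of years, can be generated as in the sketch below. The function names are hypothetical; the odd-length check reuses the figures of Illustration 2, since the original figures of the even-year example are not reproduced above.

```python
def centred_deviations(n):
    """Deviations of n equally spaced years from the mid-point of the series.

    For odd n the middle year is 0; for even n the two central years
    are -0.5 and +0.5.
    """
    mid = (n - 1) / 2
    return [i - mid for i in range(n)]


def fit_line_centred(ys):
    """With the origin at the mid-point, a = sum(y)/N and b = sum(xy)/sum(x^2)."""
    xs = centred_deviations(len(ys))
    a = sum(ys) / len(ys)
    b = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return a, b


print(centred_deviations(5))
print(centred_deviations(6))
a, b = fit_line_centred([83, 92, 71, 90, 169])
print(a, b)
```

Because the deviations sum to zero, each normal equation contains only one unknown, which is what makes the short method short.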


Logarithmic Straight Line. The straight-line arithmetic 
trend is used when the time series is found to be increasing or 
decreasing by equal absolute amounts each year. The logarithmic 
straight line is used as an expression of the secular movement 
when the series is increasing or decreasing by a constant 
percentage rather than a constant absolute amount. Such a 
tendency is found in many economic and business data. The 
equation of this curve is 

$\log y = \log a + x \log b$   or   $y = ab^x$ 

Normal equations are 

$\sum \log y = N \log a + \log b \sum x$ 

$\sum (x \log y) = \log a \sum x + \log b \sum x^2$ 

Plotted on a logarithmic graph, the curve will be a straight 
line. 

If the middle year is taken as origin, so that $\sum x = 0$, the 
normal equations will be 

$\sum \log y = N \log a$ 
$\sum (x \log y) = \log b \sum x^2$ 

so that 

$\log a = \frac{\sum \log y}{N}$ 

$\log b = \frac{\sum (x \log y)}{\sum x^2}$ 

The logarithms are then converted back into actual numbers 
(antilogarithms) to arrive at the trend values. 

Illustration—3 
Fit a logarithmic straight line trend to the following data : 


Year |   y  |  x  | Log y  | x²  | x Log y  | Log y' | Antilog of 
     |      |     |        |     |          |        |   Log y' 
1949 |   30 | -6  | 1.4771 |  36 | - 8.8626 | 1.4290 |   26.9 
1950 |   31 | -5  | 1.4914 |  25 | - 7.4570 | 1.4871 |   30.7 
1951 |   32 | -4  | 1.5052 |  16 | - 6.0208 | 1.5451 |   35.1 
1952 |   41 | -3  | 1.6128 |   9 | - 4.8384 | 1.6032 |   40.1 
1953 |   47 | -2  | 1.6721 |   4 | - 3.3442 | 1.6612 |   45.8 
1954 |   51 | -1  | 1.7076 |   1 | - 1.7076 | 1.7193 |   52.4 
1955 |   53 |  0  | 1.7243 |   0 |   0      | 1.7774 |   59.9 
1956 |   68 | +1  | 1.8325 |   1 |   1.8325 | 1.8354 |   68.5 
1957 |   80 | +2  | 1.9031 |   4 |   3.8062 | 1.8935 |   78.3 
1958 |   87 | +3  | 1.9395 |   9 |   5.8185 | 1.9515 |   89.4 
1959 |  104 | +4  | 2.0170 |  16 |   8.0680 | 2.0096 |  102.2 
1960 |  116 | +5  | 2.0645 |  25 |  10.3225 | 2.0677 |  116.9 
1961 |  144 | +6  | 2.1584 |  36 |  12.9504 | 2.1257 |  133.6 

Total|  884 |     |23.1055 | 182 |  10.5675 | 

$\log a = \frac{\sum \log y}{N} = \frac{23.1055}{13} = 1.7774$ 

$\log b = \frac{\sum (x \log y)}{\sum x^2} = \frac{10.5675}{182} = 0.058$ 
Procedure : 


1. Find the time deviation of each year from the middle 
year (x). 

2. Square these deviations (x²). 

3. Convert the original data into logarithms (Log y), and 
multiply by x (x Log y). 

4. Find Log a = (sum of Log y) / N, and place the value so 
obtained against the middle year. 

5. Find the rate of growth by Log b = (sum of x Log y) / 
(sum of x²). This value is multiplied by each deviation and, 
after adding to or subtracting from Log a as required, the result 
is placed opposite that year (Log y'). 

6. Convert Log y' into natural numbers. 
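
The six steps above can be sketched as follows, using the data of Illustration 3. This is a sketch under stated assumptions: `math.log10` replaces the four-figure logarithm tables of the worked example, so the results agree with the table only to about three decimal places, and the function name `log_trend` is hypothetical.

```python
import math


def log_trend(ys):
    """Fit log y = log a + x log b with the origin at the middle year."""
    n = len(ys)
    mid = (n - 1) / 2
    xs = [i - mid for i in range(n)]                 # step 1: deviations
    logs = [math.log10(y) for y in ys]               # step 3: logarithms
    log_a = sum(logs) / n                            # step 4
    log_b = (sum(x * l for x, l in zip(xs, logs))    # step 5
             / sum(x * x for x in xs))               # step 2 enters here
    trend = [10 ** (log_a + x * log_b) for x in xs]  # step 6: antilogs
    return log_a, log_b, trend


ys = [30, 31, 32, 41, 47, 51, 53, 68, 80, 87, 104, 116, 144]
log_a, log_b, trend = log_trend(ys)
print(round(log_a, 3), round(log_b, 3))
print([round(t, 1) for t in trend])
```

Since log b is about 0.058, b is about 1.14, that is, the fitted series grows by roughly 14 per cent a year.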

Parabolic Curve. The straight line trend equation belongs 
to a family of simple polynomials. The second and third degree 
parabolas provide greater flexibility. 

The 2nd degree parabola. The formula for the second degree 
parabola may be written 

$y = a + bx + cx^2$ 

The values of the constants a, b and c are obtained through the 
solution of three normal equations. 


The normal equations are : 


$\sum y = Na + b\sum x + c\sum x^2$ 
$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$ 
$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4$ 


When the middle value of the time series is the origin, the 
equations will be 


$\sum y = Na + c\sum x^2$ 
$\sum xy = b\sum x^2$ 
$\sum x^2 y = a\sum x^2 + c\sum x^4$ 

Substituting the values N = 11, $\sum y = 430$, $\sum xy = 641$, 
$\sum x^2 y = 4667$, $\sum x^2 = 110$ and $\sum x^4 = 1958$ in the 
normal equations we get 

 430 = 11a + b(0) + 110c 
 641 = a(0) + 110b + c(0) 
4667 = 110a + b(0) + 1958c 

By solving these equations we get 
a = 34.8, b = 5.8, c = 0.43 

The equation of the parabola $y = a + bx + cx^2$ becomes 

$y = 34.8 + 5.8x + 0.43x^2$ 

Therefore when 

x = -5, y = 34.8 - 29 + 10.75 = 16.55 
x = -4, y = 34.8 - 23.2 + 6.88 = 18.48 
and so on. 
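
The solution of the three reduced normal equations can be sketched in code. The function name `fit_parabola_centred` is hypothetical; the sums passed in are the ones appearing in the equations above.

```python
def fit_parabola_centred(n, sum_y, sum_xy, sum_x2y, sum_x2, sum_x4):
    """Solve the normal equations of y = a + bx + cx^2 when sum(x) = sum(x^3) = 0."""
    # The second equation contains b alone.
    b = sum_xy / sum_x2
    # The first and third equations are solved together for a and c.
    c = (n * sum_x2y - sum_x2 * sum_y) / (n * sum_x4 - sum_x2 ** 2)
    a = (sum_y - c * sum_x2) / n
    return a, b, c


a, b, c = fit_parabola_centred(n=11, sum_y=430, sum_xy=641,
                               sum_x2y=4667, sum_x2=110, sum_x4=1958)
print(round(a, 1), round(b, 1), round(c, 2))

# Trend value at x = -5, using the rounded constants as in the text.
y_at_minus_5 = 34.8 + 5.8 * (-5) + 0.43 * (-5) ** 2
print(round(y_at_minus_5, 2))
```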

The third degree Parabola. The third degree parabola is 
secured by the introduction of a fourth constant in the equation, 
so that it becomes 

$y = a + bx + cx^2 + dx^3$ 

The introduction of the fourth constant 'd' permits the trend 
computed by this equation to change direction twice. The use 
of the third degree parabola makes the trend line more sensitive 
and flexible than the straight line or second degree parabola. 
The normal equations for this parabola are 

$\sum y = Na + b\sum x + c\sum x^2 + d\sum x^3$ 
$\sum xy = a\sum x + b\sum x^2 + c\sum x^3 + d\sum x^4$ 
$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4 + d\sum x^5$ 
$\sum x^3 y = a\sum x^3 + b\sum x^4 + c\sum x^5 + d\sum x^6$ 

When the middle year is taken as origin, then 

$\sum y = Na + c\sum x^2$ 
$\sum xy = b\sum x^2 + d\sum x^4$ 
$\sum x^2 y = a\sum x^2 + c\sum x^4$ 
$\sum x^3 y = b\sum x^4 + d\sum x^6$ 
Growth Curves 


Two curves commonly referred to as growth curves are 
sometimes used in economic analysis. They are the Gompertz 
curve and the Logistic curve. The formula for the Gompertz 
curve is 
$y = ab^{c^x}$ 
In the logarithmic form this is 
$\log y = \log a + (\log b)c^x$ 
The form of Logistic curve is 
y-ca pen 
These curves are similar in many respects. 
1. They both take the form of an elongated S curve. 
2. They do not assume negative values. 
3. They approximate zero at one limit and are asymptotic 
to a certain value at the other limit. 
4. The rate of growth is subject to retardation. 
5. They show small absolute growth in early years, then 
rapid growth which gradually tapers off as retardation sets in. 
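
The properties listed above can be seen numerically for the Gompertz curve. The sketch below evaluates the curve for purely hypothetical constants (a = 100, b = 0.01, c = 0.5, chosen only for illustration and not taken from the text); with b and c both between 0 and 1 the curve rises from near zero towards the upper limit a.

```python
def gompertz(x, a, b, c):
    """Gompertz curve y = a * b ** (c ** x)."""
    return a * b ** (c ** x)


# Hypothetical constants, for illustration only.
A, B, C = 100.0, 0.01, 0.5

xs = list(range(-3, 11))
ys = [gompertz(x, A, B, C) for x in xs]
growth = [later - earlier for earlier, later in zip(ys, ys[1:])]

for x, y in zip(xs, ys):
    print(x, round(y, 3))
```

The yearly increments in `growth` are small at first, reach a maximum, and then taper off, which is the retardation referred to in points 4 and 5.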
SHORT TIME FLUCTUATIONS 

Short time oscillations may be regular or irregular. 


Regular or Periodic movements. Nature and custom are 
responsible for the pronounced and regular seasonal pattern of 
many economic phenomena. Periodic variations are the 
recurrent pattern of change within the period that results from 
the operation of forces connected with climate or custom at 
different times of the period. Periodic movements may be of 
two kinds :— 


1—Seasonal variation—These variations are associated with 
seasonal changes. Many economic time series are subject to this 
type of movement. Seasonal variation is evident when the data 
are recorded at weekly, monthly or quarterly intervals. 
Although the amplitude of seasonal variations may vary, their 
period is fixed, being one year. As a result, seasonal variations 
do not appear in series of annual figures. 

2—Cyclical variations—The oscillatory movement upward 
and downward of a series of data that results from alternating 
levels of economic activity is often referred to by the economist 
as the "Business Cycle." By its very nature it has four 
successive stages: inflation, recession, depression and recovery. 
These four stages are essentially present in any cycle. The 
amplitude and the period of the cycles may not be very regular, 
but in many series which reflect economic activity in one 
way or the other, a cycle with a period of some eight or nine years 
is not uncommon. There may also be minor cycles of about 
3 years' duration. 


Irregular Variations. Irregular or erratic variations may 
also be of two kinds. There is first, the strictly random or 
chance movements. They turn the series first one way and then 
another in a purely chance manner. Secondly certain isolated or 
irregular, but powerful, movements crop up from time to time— 
like strike, political upheaval or a war. These may be called 
episodic movements. 

In the measurement of short time oscillations, the trend 
values are isolated from the original data. If from the values 
of a time series, the trend values are subtracted, the remainder 
will be short time oscillations. 

Short time fluctuations from Moving Averages. 


Illustration—5 


Analyse the short-time fluctuations in the following 
recordings of temperatures (in Fahrenheit) for the first 20 days 
of December, 1958 : 


Date 1958      Temperature  |  Date 1958      Temperature 

December  1        40       |  December 11        78 
          2        50       |           12        80 
          3        44       |           13        60 
          4        70       |           14        64 
          5        52       |           15        62 
          6        44       |           16        68 
          7        36       |           17        86 
          8        40       |           18        96 
          9        56       |           19        94 
         10        68       |           20        78 


As the data refer to the first twenty days of the month, a 
seven-day moving average is calculated to ascertain the trend. 

Date 1958  | Temperature | Seven days   | Seven days     | Short time 
December   |             | moving total | moving average | oscillations 
           |             |              | (approx.)      | 
    1      |     40      |      -       |       -        |     - 
    2      |     50      |      -       |       -        |     - 
    3      |     44      |      -       |       -        |     - 
    4      |     70      |     336      |      48        |   + 22 
    5      |     52      |     336      |      48        |   +  4 
    6      |     44      |     342      |      49        |   -  5 
    7      |     36      |     366      |      52        |   - 16 
    8      |     40      |     374      |      53        |   - 13 
    9      |     56      |     402      |      57        |   -  1 
   10      |     68      |     418      |      60        |   +  8 
   11      |     78      |     446      |      64        |   + 14 
   12      |     80      |     468      |      67        |   + 13 
   13      |     60      |     480      |      69        |   -  9 
   14      |     64      |     498      |      71        |   -  7 
   15      |     62      |     516      |      74        |   - 12 
   16      |     68      |     530      |      76        |   -  8 
   17      |     86      |     548      |      78        |   +  8 
   18      |     96      |      -       |       -        |     - 
   19      |     94      |      -       |       -        |     - 
   20      |     78      |      -       |       -        |     - 

Short time oscillations from Trend (least squares) 


Year    Original values    Trend    Short time oscillation 

1952          83             67           + 16 
1953          92             84           +  8 
1954          71            101           - 30 
1955          90            118           - 28 
1956         169            135           + 34 
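
Both tables follow the same rule: short time oscillation = original value minus trend value. A minimal sketch for the temperature data is given below; the helper name `centred_moving_average` is hypothetical, and averages are rounded to whole degrees as in the table.

```python
def centred_moving_average(values, period):
    """Moving average over an odd period, placed against the middle item.

    Positions near either end, where no average can be formed, hold None.
    """
    half = period // 2
    result = [None] * len(values)
    for i in range(half, len(values) - half):
        result[i] = round(sum(values[i - half:i + half + 1]) / period)
    return result


temperatures = [40, 50, 44, 70, 52, 44, 36, 40, 56, 68,
                78, 80, 60, 64, 62, 68, 86, 96, 94, 78]

trend = centred_moving_average(temperatures, 7)
oscillations = [None if t is None else value - t
                for value, t in zip(temperatures, trend)]

for day, (value, t, osc) in enumerate(zip(temperatures, trend, oscillations), start=1):
    print(day, value, t, osc)
```

The same subtraction applied to the least squares trend values 67, 84, 101, 118 and 135 gives the second table.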


Seasonal average and seasonal Index method. When there 
is no trend or cyclical fluctuation, the simple average may be 
found out for each month or quarter as the case may be. In 
calculating the seasonal index, averages for all months are to be 
found and an average of the averages will be calculated. This 
average will be taken as equal to 100. We may also link the 
monthly averages with this average, taking it equal to 100. 


Illustration—6 


Obtain the average seasonal variations for the following 
data of exports of raw jute : 


(In 000 Tons) 


Illustration—7 


Business Cycles in the U. S. A. and England arranged in 
chronological order (1796-1923) have had the following duration 
as measured to the nearest year : 


U.S.A. : 6, 6, 5, 3, 7, 3, 3, 5, 4, 3, 6, 1, 2, 6, 4, 3, 5, 5, 4, 
9, 5, 3, 2, 3, 4, 3, 4, 2, 3, 5, 3. 

England : 4, 6, 4, 3, 5, 4, 6, 4, 2, 6, 10, 7, 4, 8, 8, 9, 8, 10, 
7, 6, 5, 2. 


Tabulate the above figures in classes of one year each and 
calculate the average duration of the business cycle in each 
country separately. (B. Com. Lucknow) 

Solution. 

            U. S. A.                |             England 

Duration | Frequency | Frequency x  | Duration | Frequency | Frequency x 
in years |           | Duration     | in years |           | Duration 

    1    |     1     |      1       |     1    |     0     |      0 
    2    |     4     |      8       |     2    |     2     |      4 
    3    |    10     |     30       |     3    |     1     |      3 
    4    |     5     |     20       |     4    |     5     |     20 
    5    |     6     |     30       |     5    |     2     |     10 
    6    |     4     |     24       |     6    |     4     |     24 
    7    |     1     |      7       |     7    |     2     |     14 
    8    |     0     |      0       |     8    |     3     |     24 
    9    |     1     |      9       |     9    |     1     |      9 
   10    |     0     |      0       |    10    |     2     |     20 

 Total   |    32     |    129       |  Total   |    22     |    128 


Average duration of the business cycle in the U. S. A. 

= 129 / 32 = 4.03 years 

Average duration of the business cycle in England 

= 128 / 22 = 5.82 years. 
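
The tabulation and the average can be reproduced with a frequency count. The sketch below uses the England series as listed in the problem; it is illustrative only, and `collections.Counter` is simply one convenient way of forming the frequency table.

```python
from collections import Counter

england = [4, 6, 4, 3, 5, 4, 6, 4, 2, 6, 10, 7, 4, 8, 8, 9, 8, 10,
           7, 6, 5, 2]

# Frequency of each duration, in classes of one year.
frequency = Counter(england)

total_cycles = sum(frequency.values())
total_years = sum(duration * count for duration, count in frequency.items())
average_duration = total_years / total_cycles

for duration in range(1, 11):
    print(duration, frequency[duration], duration * frequency[duration])
print(total_cycles, total_years, round(average_duration, 2))
```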


Complete analysis. Complete analysis of a time series may 
isolate regular and irregular variations. The complete analysis 
of a time series is done as shown below. 


Illustration—8 


Using the data given below explain clearly how you would 
determine the seasonal fluctuations in a time series : 


Year     Summer    Monsoon    Autumn    Winter 

  1        30         81         62       119 
  2        33        104         86       171 
  3        42        153         99       221 
  4        56        172        129       235 
  5        67        201        136       302 

(M. Com. Vikram) 


(Each four-quarterly moving total is entered against the third 
of its four quarters; the next column adds it to the total that 
follows, and this sum divided by 8 gives the centred moving 
average.) 

Year and  | Value | 4-quarter | Sum of | Moving  | Short time  | Seasonal  | Irregular 
season    |       | moving    | two    | average | fluctuation | variation | variation 
          |       | total     | totals | (trend) |             |           | 
First 
S.   I    |   30  |           |        |         |             |           | 
M.   II   |   81  |           |        |         |             |           | 
A.   III  |   62  |    292    |   587  |    73   |    - 11     |   - 19    |   +  8 
W.   IV   |  119  |    295    |   613  |    77   |    + 42     |   + 68    |   - 26 
Second 
S.   I    |   33  |    318    |   660  |    83   |    - 50     |   - 75    |   + 25 
M.   II   |  104  |    342    |   736  |    92   |    + 12     |   + 25    |   - 13 
A.   III  |   86  |    394    |   797  |   100   |    - 14     |   - 19    |   +  5 
W.   IV   |  171  |    403    |   855  |   107   |    + 64     |   + 68    |   -  4 
Third 
S.   I    |   42  |    452    |   917  |   115   |    - 73     |   - 75    |   +  2 
M.   II   |  153  |    465    |   980  |   123   |    + 30     |   + 25    |   +  5 
A.   III  |   99  |    515    |  1044  |   131   |    - 32     |   - 19    |   - 13 
W.   IV   |  221  |    529    |  1077  |   135   |    + 86     |   + 68    |   + 18 
Fourth 
S.   I    |   56  |    548    |  1126  |   141   |    - 85     |   - 75    |   - 10 
M.   II   |  172  |    578    |  1170  |   146   |    + 26     |   + 25    |   +  1 
A.   III  |  129  |    592    |  1195  |   149   |    - 20     |   - 19    |   -  1 
W.   IV   |  235  |    603    |  1235  |   154   |    + 81     |   + 68    |   + 13 
Fifth 
S.   I    |   67  |    632    |  1271  |   159   |    - 92     |   - 75    |   - 17 
M.   II   |  201  |    639    |  1345  |   168   |    + 33     |   + 25    |   +  8 
A.   III  |  136  |    706    |        |         |             |           | 
W.   IV   |  302  |           |        |         |             |           | 

Years      Summer    Monsoon    Autumn    Winter 
First         -          -       - 11      + 42 
Second      - 50       + 12      - 14      + 64 
Third       - 73       + 30      - 32      + 86 
Fourth      - 85       + 26      - 20      + 81 
Fifth       - 92       + 33        -         - 
Total       -300       +101      - 77      +273 
Average     - 75       + 25      - 19      + 68 
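
The whole of this analysis can be sketched in code, using the quarterly figures of Illustration 8. The sketch assumes, as the table does, that moving averages ending in .5 are rounded upwards; Python's built-in `round` would round such halves to the nearest even number, so a small helper (hypothetical name `round_half_up`) is used instead.

```python
def round_half_up(value):
    """Round a positive number to the nearest integer, halves going up."""
    return int(value + 0.5)


# Quarterly figures in time order: Summer, Monsoon, Autumn, Winter of years 1 to 5.
data = [30, 81, 62, 119, 33, 104, 86, 171, 42, 153, 99, 221,
        56, 172, 129, 235, 67, 201, 136, 302]

# Four-quarterly moving totals; totals[i] covers quarters i .. i+3.
totals = [sum(data[i:i + 4]) for i in range(len(data) - 3)]

# Two adjacent totals, divided by 8, give the average centred on quarter i + 2.
trend = {i + 2: round_half_up((totals[i] + totals[i + 1]) / 8)
         for i in range(len(totals) - 1)}

# Short time fluctuation = original value - trend.
fluctuation = {q: data[q] - t for q, t in trend.items()}

# Seasonal variation = average fluctuation of each season (0 = Summer ... 3 = Winter).
seasonal = []
for season in range(4):
    items = [f for q, f in fluctuation.items() if q % 4 == season]
    seasonal.append(round(sum(items) / len(items)))

# Irregular variation = short time fluctuation - seasonal variation.
irregular = {q: f - seasonal[q % 4] for q, f in fluctuation.items()}

print(seasonal)
print([irregular[q] for q in sorted(irregular)])
```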


Method of Link relatives. Method of link relatives is also 
employed to find out seasonal variations. It is illustrated by 
the following example. 


Illustration—9 
Apply the link relative method of obtaining a measure of 
seasonal variation to the following data of imports. 

Imports into India 
(in lakh rupees) 


                     Quarters 
Years       I       II      III      IV 
1952       283     258     244      260 
1953       210     208     204      241 
1954       194     168     159      183 
1955       159     162     168      189 
1956       184     179     176      197 
1957       179     182     182      219 
1958       200     204     207      243 

Computation of link relatives : 
Each quarterly figure is divided by that of the preceding 
quarter and the result is expressed as a percentage. 

Link relative = (Current season's figure / Previous season's figure) x 100 


Link Relatives 


                     I        II       III       IV        I 
1952                 -       91.17    94.57    106.56      - 
1953               80.76     99.05    98.07    118.14      - 
1954               80.49     86.59    94.64    115.09      - 
1955               86.88    101.88   103.70    112.50      - 
1956               97.35     97.28    98.32    111.93      - 
1957               90.86    102.79   100.00    120.33      - 
1958               91.32    102.00   101.47    117.39      - 
Total :           527.66    680.76   690.77    801.94      - 
A. M. :            87.94     97.25    98.68    114.56      - 
Chain Relatives : 100.00     97.25    95.97    109.94    96.68 
Adjusted C. R. :  100.00     98.08    97.63    112.43   100.00 
Seasonal Index :   98.10     96.10    95.60    110.20      - 


Procedure : 


(i) Each arithmetic mean link relative indicates the average 
relation of each quarter to the preceding quarter. 


(ii) Construction of Chain Relatives 


The first quarter's C. R. = 100 

The second quarter's C. R. = the second quarter link relative 
= 97.25 

The third quarter's C. R. = (3rd quarter L. R. x 2nd quarter 
C. R.) / 100 = (98.68 x 97.25) / 100 = 95.97 

Similarly, 

The fourth quarter's C. R. = (4th quarter L. R. x 3rd quarter 
C. R.) / 100 = (114.56 x 95.97) / 100 = 109.94 

The new first quarter C. R. = (1st quarter link relative x 
4th quarter C. R.) / 100 = (87.94 x 109.94) / 100 
= 96.68 (put in the sixth column of the table) 


(iii) Adjusted C. R.s : 
Correction to be applied : 

(New 1st quarter C. R. - first quarter C. R.) / 4 
= (96.68 - 100) / 4 = -0.83 

Thus, adjusted C. R. for 1st quarter = 100 
  "       "      "    "  2nd quarter = 97.25 + (0.83) = 98.08 
  "       "      "    "  3rd quarter = 95.97 + 2(0.83) = 97.63 
  "       "      "    "  4th quarter = 109.94 + 3(0.83) = 112.43 

(iv) Seasonal Indices 

The average of the adjusted C. R. = 408.14 / 4 = 102.035 

Seasonal Index = (Adjusted C. R. / Average of the adjusted C. R.) x 100 


Thus, Seasonal Index for the first quarter 
= (100 / 102.035) x 100 = 98.1 

Seasonal Index for the 2nd quarter 
= (98.08 / 102.035) x 100 = 96.1 

and so on. 
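
Steps (ii) to (iv) can be sketched as follows, starting from the four average link relatives of the table. The function name `chain_relatives` is hypothetical, and because the sketch carries full decimal precision rather than rounding at each step, its seasonal indices may differ from the printed ones in the first decimal place.

```python
def chain_relatives(mean_links):
    """Chain the average link relatives, the first quarter being taken as 100.

    Returns the chain relatives and the 'new' first quarter chain relative
    obtained by carrying the chain round the full year.
    """
    chain = [100.0]
    for link in mean_links[1:]:
        chain.append(link * chain[-1] / 100)
    new_first = mean_links[0] * chain[-1] / 100
    return chain, new_first


mean_links = [87.94, 97.25, 98.68, 114.56]   # quarters I to IV
chain, new_first = chain_relatives(mean_links)

# Correction: the discrepancy at the new first quarter, spread over 4 quarters.
step = (100.0 - new_first) / 4
adjusted = [c + k * step for k, c in enumerate(chain)]

# Seasonal index: adjusted chain relative as a percentage of their average.
average = sum(adjusted) / len(adjusted)
indices = [100 * c / average for c in adjusted]

print([round(c, 2) for c in chain], round(new_first, 2))
print([round(a, 2) for a in adjusted])
print([round(i, 1) for i in indices])
```

Dividing by the average of the adjusted chain relatives makes the four indices average exactly 100.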


Utility of Time Series Analysis to a Business man and to an 
Economist. The utility of time series analysis may be studied 
under the following heads : 


1. Planning for the future is facilitated—One of the most 
difficult problems confronting any organisation is that of 
planning for the future. Planning depends upon estimation of 
future needs. One of the statistical measures used in making such 
estimates is the secular trend. If enough data are at hand, the 
assumption that the annual rate of growth observed in the past 
will continue for a time in the future is usually reasonable and 
valid. Since time series analysis also measures cyclical changes, 
programmes can also be adjusted according to the expected phase 
of a cycle. For example, if prosperity is expected, future 
programmes will be undertaken at a rapid pace. 


2. Time Series Analysis and determination of the statistical 
position of an enterprise—Time series analysis enables the 
executive to determine the statistical position of the organisation 
which he directs. The entrepreneur may compare the actual 
behaviour of his enterprise against the normal behaviour of the 
economy as a whole. 


3. Time Series Analysis and Controls—The essence of any 
control system is the setting up of a standard according to the 
objective of the organisation and the establishment of a systematic 
plan for checking actual performance against this standard. One 
of the most important results of time series analysis for 
managerial control is the measure of seasonal variations. If it 
is known in advance that a typical seasonal variation will take 
place, production programmes will be adjusted accordingly. 

4. Time Series Analysis and the Reduction of undesirable 
variations—It has long been recognised that seasonal oscillations 
in business activity are a great source of waste, largely because 
of idle plant during off peak season. If the amount and nature 
of seasonal variation can be measured, the seriousness of the 
problem will be reduced. Time series analysis is also helpful in 
reducing the cyclical variation. Cycle curves indicate that the 
market for durable goods is much more sensitive to changes in 
demand than is the market for non durable goods, and the 
consumers' non durable goods are less influenced by cyclical forces 
than are markets for producers' non durable goods. Within 
limits a management may take its business out of boom or bust. 


Time series analysis makes a further contribution to the 
reduction of undesirable fluctuations by aiding economists and 
executives in forming reasoned judgement as to the statistical 
position of the economy or business, that is, whether it is in a 
recovery, prosperity, recession or depression phase. On the 
basis of such judgements decisions are reached relating to new 
construction, credit extension, production and sales policies. 
Time series analysis may help the enterprise to escape the worst 
consequences of the cyclical changes. 


5. Time Series Analysis and Economic Analysis—Economic 
statisticians have identified the growth element in the economy 
as a whole and in particular industries. They have examined the 
cyclical patterns and stated hypotheses both as to the periodicity 
of their occurrence and the forces which produce them. Short 
cycles, intermediate cycles, trend cycles and longer cycles have 
been found by different investigators. By time series analysis 
they have examined economic behaviour. 


Theoretical Questions 


1—Write a brief essay on ‘Analysis of Time Series’. 
(M. A. Raj.) 


2—What is meant by 'Trend' ? How would you statistically 
eliminate the influence of seasonal and cyclical fluctuations on 
the long period movement of any series ? (M. A. Raj.) 


3—What do you understand by an ‘Economic Time Series’ ? 
Why and how do you decompose such a series into various 
components such as trend, seasonal variations, business cycle 
etc. ? How is trend in a given time series studied ? Explain 
giving illustration. (M. A. B. H. U.) 


4—What is a 'trend' in a time series ? Describe briefly the 
methods known to you for determining it in a time series. 
(M. Com. B. H. U.) 


5—Explain the various kinds of fluctuations in a time series 


and show their significance by taking a few examples. 
(B. Com. Alld.) 


6—What is the meaning and importance of analysis of time 
series data ? Enumerate the methods of finding the trend that 
you know ? 

7—(a) Distinguish between regular and irregular fluctuations 
in a time series. 


(b) Write a short note on the values of analysing time 
variations. (M. A. Punjab) 


8—Discuss the claims and limitations of the method of 
moving averages as applied to analysis of time series. 
(M. A. Eco. St. Delhi) 


9. Describe briefly the statistical procedure you would 
adopt for the analysis of time series and explain carefully how 
you would isolate the secular trend. (М. Sc. Ag. Agra) 

10. The analysis of time series consists of the description 
and measurement of various changes or movements as they appear 
in the series during the period of time. 

Classify these changes or movements. Mention the different 
methods used for measuring trend and explain fully any one 
of them. (M. Com. Agra) 

Practical Questions 


1—Represent the following data graphically. Also show the 
3-yearly and 5-yearly moving averages to indicate the trend : 


Year    Birth    |  Year    Birth    |  Year    Birth 
        Rate     |          Rate     |          Rate 
1917  ..  30.9   |  1924  ..  31.0   |  1931  ..  23.1 
1918  ..  30.3   |  1925  ..  29.0   |  1932  ..  23.7 
1919  ..  29.1   |  1926  ..  27.9   |  1933  ..  22.6 
1920  ..  31.4   |  1927  ..  27.7   |  1934  ..  23.6 
1921  ..  33.4   |  1928  ..  26.4   |  1935  ..  23.0 
1922  ..  30.2   |  1929  ..  24.7   |  1936  ..  22.0 
1923  ..  30.4   |  1930  ..  24.1   |  1937  ..  22.6 
                 |                   |  1938  ..  22.9 
(M. A. Alld.) 
2—Fit a straight line trend to the following data : 
Year | 1951 | 1952 | 1953 | 1954 | 1955 | 1956 | 1957 
y    | 2.5  | 2.0  | 3.5  | 3.0  | 4.5  | 3.0  | 4.0 


(Ans. y' = 2.40, 2.67, 2.94, 3.21, 3.48, 3.75, 4.02) 


3—PERCENTAGE UNEMPLOYED AMONG INSURED PERSONS 
(Average for the year) 
Year Males Females Year Males Females 


1941 16.1 8.7 1949 16.4 14.4 
1942 124.4 59.0 1950 22.4 17.7 
1943 10.8 8.5 1951 25.1 18.5 
1944 12.0 8.1 1952 28.1 11.2 
1945 18.2 9.5 1953 19.1 9.8 
1946 10.9 6.2 1954 17.8 9.8 
1947 12.2 6.7 1955 14.6 8.8 
1948 11.5 7.2 1956 11.8 7 


Represent on one graph paper the series for males and that 
for females. Smooth the series for males by a five-yearly moving 
average. 
(M. A., B. H. U.) 


4—Below are given the figures of production (in thousand 
maunds) of a sugar factory : 


Year        Production in 
            thousand maunds 
1941             80 
1942             90 
1943             92 
1944             83 
1945             94 
1946             99 
1947             92 

(a) Find the slope of a straight line trend fitted to these 
figures. 
(b) Plot these figures on a graph and show the trend line. 
(c) Do these figures show a rising trend or a falling trend ? 
How do you arrive at your conclusion ? 
(M. Com. Luck.) 


(Ans. (a) The slope of the straight line is 2. 
(b) The trend values are 84, 86, 88, 90, 92, 94, 96. 
(c) Rising trend.) 


5—Explain how you would deal with a time series, and 
illustrate your remarks with the help of the following series of 
annual figures for the period 1901-1930 : 


Period          Annual values 
1901-1910 .. 208, 223, 225, 222, 239, 242, 238, 252, 257, 250 
1911-1920 .. 273, 270, 268, 288, 284, 282, 300, 303, 298, 313 
1921-1930 .. 317, 309, 329, 333, 327, 345, 344, 348, 362, 360 


(I.C.S.) 


(Ans. It is required to show the method of finding the trend 
and the short time oscillations in the series. If we assume a 
five yearly trade cycle, the trend values and short time oscillations 
will be : 
Trend : 223, 230, 233, 239, 246, 248, 254, 260, 264, 
270, 277, 278, 284, 291, 293, 299, 306, 308, 
313, 320, 323, 329, 334, 338, 344, 351. 
Short time oscillations : +2, -8, +6, +3, -8, +4, +3, -10, 
+9, 0, -9, +10, 0, -9, +7, +4, -8, 
+5, +4, -11, +6, +4, -7, +7, 0, -3) 
6—The index numbers of annual production of a commodity 
(1900 = 100) are given below : 


Year   Annual Average  |  Year   Annual Average 

1927        165        |  1939        280 
1928        178        |  1940        351 
1929        236        |  1941        320 
1930        213        |  1942        370 
1931        180        |  1943        325 
1932        163        |  1944        366 
1933        180        |  1945        256 
1934        187        |  1946        304 
1935        210        |  1947        291 
1936        237        |  1948        277 
1937        203        |  1949        274 
1938        215        |  1950        272 


Plot them. Assuming a ten-yearly cycle, find the trend 
values by the method of moving averages. (M.A., Allahabad) 
(Ans. The trend values are : 196.80, 200.50, 204.60, 213.70, 
227.60, 244.95, 262.55, 278.75, 290.00, 295.65, 303.40, 310.90, 
313.70, 309.45) 

7—The revenue from sales tax in U. P. during 1948-49 to 
1952-53 is shown in the following table. Fit a straight line 
trend by the method of least squares and exhibit the data as also 
the trend on a graph paper : 


Years          Revenue 
            (Rs. in Lakh) 
1948-49         427 
1949-50         612 
1950-51         521 
1951-52         495 
1952-53         490 
(B. Com. Alld.) 


(Ans. The trend values are : 507.2, 508.1, 509.0, 509.9, 
510.8) 


8—Calculate the five yearly moving average of acres under 
tea in India from the following data : 

Plot on a squared paper (1) the annual area under tea and 
(2) its five yearly moving average. 

State other methods of finding the secular trend. 


Year      Area in (000 acres) 
1925            672 
1926            679 
1927            690 
1928            702 
1929            712 
1930            802 
1931            807 
1932            809 
1933            816 
1934            821 
(B. Com. Luck.) 


(Ans. The trend values are : 691.0, 717.0, 742.6, 766.4, 
789.2, 811.0) 


9—Analyse the following series :— 


Quarters I II III IV 
1957 128 115 122 149 
1958 112 155 156 188 
1959 141 178 162 223 
1960 173 214 215 235 


10—Estimate the influence of trend, seasonal and random 
variations on the following data. 


Quarters I II III IV 
1957 78 62 56 fil 
1958 84 64 61 82 
1959 92 70 63 85 


1960 100 81 72 96 

11—Plot the following data on a graph paper. Calculate the 5 
yearly moving average and show the trend on the same paper. 


Years    Index  |  Years    Index 
1940      105   |  1946      85 
1941      115   |  1947      75 
1942      100   |  1948      60 
1943       90   |  1949      65 
1944       80   |  1950      70 
1945       95   |  1951      55 


(B. Com. Gujrat) 


(Ans. Moving averages : 98, 96, 90, 85, 79, 76, 71, 65) 

12—Fit a straight line trend by the method of least squares 
to the growth of the Reserves of Co-operative Societies in India as 
given below : 

Year        Reserves       |  Year        Reserves 
         (Lakhs of Rs.)    |           (Lakhs of Rs.) 
1927-28       612          |  1931-32      1001 
1928-29       719          |  1932-33      1106 
1929-30       820          |  1933-34      1231 
1930-31       907          | 
(M.A. Punjab) 


(Ans. Trend : 612.5, 712.9, 813.3, 913.7, 1014.1, 1114.5, 
1214.9) 


CHAPTER 17 


SAMPLING 


"Whenever a large sample of chaotic elements are taken in 
hand and marshalled in the order of their magnitude, an unsuspected 
and most beautiful form of regularity proves to have been latent 
all along." 

Sir Francis Galton 


Meaning and Nature. Our knowledge, our attitudes and 
our actions are based to a very large extent upon samples. This 
is equally true in every day life and in scientific research. In 
our day-to-day life we adopt the sampling technique almost 
every moment of our existence. We go to the market, examine 
a sample of wheat or rice, form an idea about the quality, and 
decide whether the quality is acceptable or not. We examine 
a small portion of shirt-cloth, and form an idea about the 
quality of the entire length. We meet a person for a short while, 
and form an opinion about his character and personality. A 
traveller spends a few days in a foreign country, and then 
proceeds to write a book about that country, or to advise the 
government of that country on planning and taxation 
reforms. All this is adopting the sampling technique. The 
sampling procedure is based on the assumption that the 
circulating blood is always well mixed, and that one drop tells 
the same story as another. 


Sample Enquiry. In planning a statistical enquiry it must 
be decided whether the investigation is to take into 
account the whole population or only a part of it. When it is 
concerned with the whole population it is called a census 
enumeration or census enquiry; when only a part of the 
population is taken into account it is called a sample 
enumeration or sample enquiry. The selection of a part of an 
aggregate to represent the whole is a long-established practice. 
Statistical methodology has placed this technique on a scientific 
basis. In statistics, a sample is a part or sub-set of observations 
taken from the population, and sampling is the procedure by 
which information is obtained from only a part of the population. 
Under the sampling method 'a small group chosen at random 
from a large group' is selected and the necessary investigation is 
made about this group. Sampling is based upon the principle 
that organised knowledge is representative in character. An 
individual observation included in the sample is a sample unit 
or 'su'. In some statistical enquiries sampling is the only 
possible method, in others it is the best method, and 
in still others it may be a better method. Sampling has certain 
principal advantages as compared with complete 
enumeration. They are: 


(i) Reduced cost—If data are secured from only a small 
fraction of the aggregate, expenditure may be expected to be 
smaller than if a complete census is attempted. 


(ii) Greater speed—For the above reason, the data can be 
collected and summarised more quickly with a sample than with 
a complete count. This may be a vital consideration when the 
information is urgently required. 

(iii) Greater scope—In certain types of enquiry, highly 
trained personnel or specialised equipment, limited in 
availability, must be used to obtain the data. A complete census 
may then be impracticable. The choice lies between obtaining 
the information by sampling or not at all. Thus surveys which 
rely on sampling have more scope and flexibility as to the types 
of information that can be obtained. 


(iv) Greater Accuracy—Because personnel of higher quality 
can be employed and can be given intensive training, a sample 
may actually produce more accurate results than the kind of 
complete enumeration that it is feasible to take. 


Parameters and Statistics 

A descriptive measure of the population is called a ‘Parameter’, 
and a descriptive measure pertaining to a sample is called a 
‘Statistic’. 

Types of Universe. A population is also known as a ‘Universe’. 
A universe can be either ‘finite’ or ‘infinite’. By a finite universe 
we mean a population which contains a definite number of 
units, such as the number of students. An infinite universe is 
one in which the number of units is infinite. Thus the lengths 
of leaves and the heights of students are infinite universes. Though 
it is possible to measure the length of the leaves or the heights of 
students, the actual measurements will vary within certain 
limits. 


A ‘Universe’ may also be ‘hypothetical’ or ‘existent’. 
A hypothetical universe is one which does not consist of concrete 
objects; e.g. we can throw a dice innumerable times. Hence we 
can form a universe by throwing the dice a large number of 
times and recording the results. An existent universe refers to a 
population of concrete objects like the number of students or 
the number of books. 


Objects of Sampling. In general there are two objects of 
sample-study: (1) to use the sample information to test 
hypotheses about the parent population from which it was 
drawn, or (2) to make inferences about the nature of that parent 
population. The characteristics of the parent population can be 
found out by a sample study, which will save money, time and 
energy. The main object of sample studies is to obtain the 
best possible values of the parameters. 


Essentials of a good Sampling. (i) The securing of a 
representative sample is the first essential of the method of 
statistical induction or inference. The sample taken should be 
such as to possess the same characteristics as those possessed 
by the original universe from which it has been taken. 
This is possible only when each item of the universe has a chance of 
being included in the sample. If two samples are taken from the 
same universe, they should be similar. 


(ii) In simple sampling, the individual items composing the 
sample should be independent of each other. 


(iii) There should be no essential difference between two 
qualities from which items have been selected and also between 
the period of time covered by each instance. 


(iv) The regulating conditions should be the same for 
every individual instance in the sample. 


Sampling Techniques 


Selection of a sample should be done in such a manner 
that the sample taken is representative of the population. 
At least, the variables under investigation should be present in 
the sample in the same manner as in the population. 


There are two ways of taking a sample. One is called the 
purposive or deliberate or judgement sampling, and the other 
is random or chance or probability sampling. 


1—Purposive Sampling—There is no special technique for 
selecting a purposive sample. The statistician exercises his 
individual judgement in selecting it. Consequently the sample 
will vary from one investigator to another, and there is consi- 
derable scope for bias affecting selection of the sample. 
Purposive selection may produce good results when the sample 
is small or the selector is so expert that a statistical 
study was not needed anyway. Under purposive selection of a 
sample ‘chance’ is not allowed to play freely. Some control or 
judgement is applied in selection. This method of taking a 
sample has the following defects: 


(1) The selection of sample-items may be due to individual 
bias. Hence the sample may not be truly representative of the 
population. 


(2) As selection of items is not subject to chance or 
probability, it is difficult to calculate correct sampling errors. 

(3) Sample estimates have no guarantee of accuracy. 


(4) In selecting items inclination becomes more important 
than judgement. 


Due to these defects the method of purposive sampling is falling 
into disuse more and more. 


2—Random Sampling—‘Random’ as used in statistics is a 
technical word; it has a meaning different from the one given 
in popular usage. When a sample is called ‘random’, this 
describes not the data in the sample, but the process by which 
the sample was obtained. Thus randomness is a property not of 
an individual sample but of the process of sampling. Random 
sampling does not mean haphazard selection or selection of 
sample units as they occur to the investigator. According to 
Dr. F. Yates, a random sample is one in which “every member 
of the parent population has had an equal chance of being 
included.” The individual is invariably biased in selection and 
is doubtless unconscious of the fact. To eliminate the possibility 
of human prejudices interfering in the selection of a represen- 


548 AN INTRODUCTION TO MODERN STATISTICS 


tative sample, the method of random selection has been devised. 
Under this method chance alone is allowed to determine which 
items from the population are to be selected. Every member 
of the universe has an equal chance of being included in the 
sample. There is no room for discrimination. In fact random 
sampling is a scientific way of getting a sample from some 
universe. This method is also known as the ‘unrestricted sampling’ 
device, because the sample items are selected from 
the whole population free from all restrictions. In the words of 
Lillian Cohen, “A sample is considered a simple random one 
if its members are drawn in such a way that each observation 
of the universe has an equal chance of being included in the 
sample, and every possible combination of observations in the 
universe has the same chance of being included.” 
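
The definition quoted above can be tried out on a computer. The following is only a minimal sketch, assuming a hypothetical universe of 5000 numbered items and a sample of 15; the function name and the seed are illustrative. Python's `random.Random.sample` draws without replacement, so that every item, and every combination of items, is equally likely to be chosen.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Draw k distinct items; every item and every k-item
    combination has the same chance of being selected."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(list(population), k)


# Hypothetical universe: items numbered 1 to 5000.
universe = range(1, 5001)
sample = simple_random_sample(universe, 15, seed=1963)
print(sorted(sample))
```
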


Several devices have been adopted for random selection of 
the sample units. One is the ‘pack of cards’ method. The items 
in the universe are given certain numbers, and the cards in a 
pack, or more than one pack if necessary, are allotted 
corresponding numbers. The cards are shuffled and then a number 
of cards is drawn equal to the number of sample units to be 
selected. The individuals in the population with numbers 
corresponding to the numbers on the cards drawn constitute the 
sample units. 


The second method of selection is drawing of lots. The 
sample units are selected by the lottery. 


These first and second methods are being abandoned these 
days because (i) these methods are not practicable when large 
samples are to be drawn, say a sample of 4000 items; (ii) in spite 
of the best possible care, it is not possible to ensure that under 
these methods each item has an equal chance of being 
selected. The material of the cards will not be absolutely 
uniform, so that in shuffling, the pack will have 
a tendency to be shuffled in a particular manner, and that will 
introduce bias. In the case of the lottery, the pieces of paper on 
which the items are indicated will not be uniform. The inside 
surface of the churning bowl will also not be perfectly uniform. 
So some of the lots will be more prone to come out than others 
when lots are drawn, and this will introduce bias. 


The third method, to which investigators are taking more 
and more recourse for the selection of samples, is known as ‘random 
sampling numbers’. Certain numbers have been arranged, 4 
digits at a time, in columns and rows by L. H. C. Tippett. He 
selected 41,600 digits from census reports and combined them 
by fours so as to give 10,400 four-figure numbers. These 
numbers are called ‘Tippett’s random numbers’. 


These numbers are of the following type :— 
Random Sampling Numbers 


1254 2858 7858 4024 3684 8485 2617 5488 
5443 4911 0922 7184 4798 1311 8701 2210 
3262 2322 4112 9877 4776 4512 1746 2598 
7809 0297 8956 2158 7780 0753 1232 7181 
6862 4194 3596 5072 4478 8099 0729 4950 
9179 3814 9153 2197 6746 9646 8105 8188 
5317 0986 0633 6480 4834 8710 8829 8572 
0126 4777 8034 9217 2128 2232 5039 8687 
2372 7774 9446 7178 8408 3971 0899 5274 
0357 5276 3999 0261 9255 5780 5728 0032 
7855 09707 6259 4268 9878 4918 0987 9118 
2510 4254 1548 0224 0112 6523 8687 4707 
6639 1918 3120 9149 6145 5895 0726 3883 
6769 1485 9107 4762 9902 8764 7388 2729 
4527 8000 8648 3366 7945 4847 4317 9636 
5609 9883 2486 0893 4132 6668 0799 6137 
4130 1445 2887 0724 1294 8988 1527 1467 
4506 2474 3590 8308 7640 7128 1023 2418 
4645 0618 9846 4458 5666 7671 1184 2328 
6685 3544 9828 9187 0506 6473 5356 8940 
1983 8999 1017 7268 7699 4151 8132 7271 
9944 0845 7468 3936 9002 0857 5784 4480 
0330 9918 4990 7790 6932 0871 1988 9881 
9903 7914 4133 6826 0230 1337 7413 8840 
1614 7862 9500 4109 1037 2978 6075 0971 
1596 3069 7906 6656 5298 5090 8580 6756 
4951 6998 5243 6375 5088 2078 8389 0822 
9867 0399 6741 9579 7559 4171 1364 9890 
2670 7060 2909 8705 0188 9808 7206 8104 
5808 4390 66081 9852 9458 0863 8491 1166 
8298 1195 9153 9517 1481 0879 2355 2615 
3117 0939 0770 8084 9978 9120 0967 0334 
3109 1169 0151 3869 3360 
0039 4461 8674 5296 8111 5748 7451 2141 
9400 4971 0464 1849 7055 3093 6860 7777 
5080 6749 8832 2760 5220 3344 5704 6859 
0996 0851 09650 7107 0121 5224 1225 4881 
6909 4971 4087 4783 9604 8725 4088 4028 
7341 5030 8157 0130 5818 5682 5231 6721 
4891 6223 9252 0259 2174 0304 0848 6429 
1036 9133 5407 1734 5114 3940 1052 6210 
9140 7896 0881 7064 5862 2466 9764 7913 
1281 8613 3431 0104 0689 3742 4788 9887 
9426 3819 5761 8019 3637 9217 1250 1852 
9156 4867 9269 6149 4471 8007 5156 2117 
2881 7885 1188 1705 2254 4336 4869 0668 
7924 5505 0432 3101 0995 2741 1570 2507 
1980 8420 7225 8292 2557 8286 9977 4910 
0987 2467 1820 3694 7650 2971 1982 3115 
7558 5563 2431 0144 0099 9578 9302 8922 
These numbers have been put to all possible tests and they 
have been found to be really random. With the help of these 
numbers, the work of selecting random samples has become 
very simple. What one has to do is to take any table of random 
numbers and start using the table from any position, either 
horizontally or vertically. But once having started, not a single 
number should be left out and the order should not be 
disturbed. For example, suppose the population consists of 5000 
items and a sample of 15 items is to be taken. We can start from 
the beginning and select the following numbers: 


1254, 2858, 4024, 3684, 2617, 4911, 0922, 4798, 1311, 2210, 
3262, 2322, 4112, 4776 and 4512. The numbers higher than 
5000 are ignored. 
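
The procedure just described can be sketched in a few lines of Python. This is an illustration only: the three rows are the opening rows of the table, with the entries 1311, 3262 and 2322 taken as they appear in the list of fifteen numbers above, and the function name is hypothetical. Skipping a number that has already been drawn is an added assumption (sampling without replacement); the text itself only speaks of ignoring numbers above 5000.

```python
# Opening rows of the random-number table, read left to right.
TABLE = [
    "1254 2858 7858 4024 3684 8485 2617 5488",
    "5443 4911 0922 7184 4798 1311 8701 2210",
    "3262 2322 4112 9877 4776 4512 1746 2598",
]


def select_from_table(rows, population_size, sample_size):
    """Read the table in order, keeping numbers that fall in 1..population_size."""
    chosen = []
    for row in rows:
        for token in row.split():
            number = int(token)
            # Ignore numbers outside the population and repeats (assumption).
            if 1 <= number <= population_size and number not in chosen:
                chosen.append(number)
                if len(chosen) == sample_size:
                    return chosen
    raise ValueError("table exhausted before the sample was complete")


sample = select_from_table(TABLE, population_size=5000, sample_size=15)
print(sample)
```

Run on these rows, the sketch reproduces the fifteen item numbers listed in the text, in the same order.
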


The merits of random Sampling are :— 


(1) It is a more scientific method of taking out a sample 
from a universe. There is less chance of individual bias. 
Every item in the universe has a chance of being selected. 


(2) Almost any sampling method will have some pattern 
of variability, but random samples are the only ones that have 
a known pattern of variability. The pattern of sampling 
variability for any universe is known only if sampling is random. 


(3) Randomness is important in statistics primarily because, 
if a sample is random, the theory of probability is applicable. 


(4) The greater the number of items in the sample 
selected at random, the greater will be the chances of the sample 
possessing the characteristics of the universe, as the Law of 
Inertia of Large Numbers and the Law of Statistical Regularity 
begin to operate. 


Demerits of random sampling—(1) Random selection of 
samples is often more costly than non-random selection. 

(2) There may be occasions when only very few items are 
to be included in the sample. For example, if an intensive study 
of cities is to be made, and only 4 or 5 cities are to be covered, 
then random sampling will not be possible. 


(3) When only certain data are accessible, random sampling 
is not possible. 


3—Systematic or Quasi-random Sampling—According to 
this method a list of the universe is prepared on some basis. 
The basis may be alphabetical, geographical, numerical or some 
other order. The first item is selected at random. Then every 
nth item (5th or 10th or any other number) is taken. Such 
sampling is a systematic one. For example, if the list of a 
universe comprises 25,000 items and the sample required is 500, then 
the selection of every fiftieth item will yield the required 
sample. The starting point is determined by selecting at random 
a number between 1 and 50. 
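
The example of 25,000 items and a sample of 500 may be sketched as follows. The code is a minimal illustration under the assumption that the list length is an exact multiple of the sample size; the function name and the seed are hypothetical.

```python
import random


def systematic_sample(items, sample_size, seed=None):
    """Take every k-th item after a random start, where k = len(items) // sample_size."""
    interval = len(items) // sample_size              # 25000 // 500 = 50
    start = random.Random(seed).randint(1, interval)  # random start between 1 and 50
    positions = range(start, len(items) + 1, interval)
    return [items[pos - 1] for pos in positions][:sample_size]


universe = list(range(1, 25001))  # hypothetical list of the universe
sample = systematic_sample(universe, 500, seed=3)
print(sample[:5], len(sample))
```

Only the starting point is left to chance; every later item follows from the constant interval.
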


Strictly speaking systematic or quasi random sampling is 
not truly random. This is because, once the initial starting 
point has been determined, it follows that the remainder of 
the items selected for the sample are pre-determined by the 
constant interval. If there is any periodicity in the list and 
a particular type of unit occurs at the appropriate interval, 
then it is feared that such type will be over-represented in the 
sample. 


4—Cluster Sampling—Under this method random selection 
is made out of groups of items. For example, manufacturing 
concerns differ in regard to nature of product, capital invested, 
number of employees and in many other ways. Similarly 
people may differ in respect of sex, age, race, occupation, 
religion and so forth. Such differences are important and 
are kept in mind when a sample is selected. Hence random 
selection is made out of the groups, and not every item of the 
universe has an equal chance of being selected. 


5—Stratified Sampling—When a population is heterogeneous, 
or in other words, different segments or strata exist in the 
population, then it is stratified. We stratify the population by 
dividing it into strata so that each stratum is more or less 
homogeneous, and make a random selection of a sample from each 
stratum. The division of the population into strata or groups 
is done according to some relevant characteristic. Each 
stratum is called a sub-population. These sub-populations are 
non-overlapping and together they comprise the whole of the 
population. 


Stratification is a very common technique. It 
is useful in the following cases: 


(1) If data of known precision are wanted for certain 
sub-divisions of the population, it is advisable to treat each 
sub-division as a population in its own right. 


(2) Administrative convenience may dictate the use of 
stratification; e.g. the agency conducting the survey may have 
field offices, each of which can supervise the survey for a part 
of the population. 


(3) Sampling problems may differ markedly in different 
parts of the population. 


(4) Stratification may bring about a gain in precision in 
the estimates of characteristics of the whole population. The 
basic idea is that it may be possible to divide a heterogeneous 
population into sub-populations, each of which is internally 
homogeneous. 


Generally, the number of units selected from each stratum 
is proportional to the number of units in that stratum in the 
population. 


Stratified sampling is not possible unless some information 
concerning the population and its strata is available. An 
important point, which should be given due consideration, is that 
the strata should be ones which are related to the topic being 
studied. If we are making a health study of students in a 
college, the strata may be according to whether they live at home 
or in a hostel, whether they take regular exercise or 
not, and so forth. Many public opinion and market research 
organizations make use of the principle of stratified sampling. 
For a non-homogeneous population a properly stratified sample 
may yield more reliable results than a simple random sample 
of the same size. 
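
As a rough sketch of stratified selection with proportional allocation, consider the health study of college students mentioned above. The stratum sizes (600 living at home, 400 in a hostel), the sample of 50 and all the names in the code are hypothetical. Each stratum's share is rounded to a whole number, so in general the shares may not add up exactly to the intended total.

```python
import random


def stratified_sample(strata, total_sample_size, seed=None):
    """Draw a simple random sample from each stratum, in proportion to its size."""
    rng = random.Random(seed)
    population_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        share = round(total_sample_size * len(units) / population_size)
        sample[name] = rng.sample(units, share)
    return sample


# Hypothetical college: 600 students living at home, 400 in a hostel.
students = {
    "home": [f"H{i}" for i in range(600)],
    "hostel": [f"S{i}" for i in range(400)],
}
chosen = stratified_sample(students, 50, seed=7)
print({name: len(units) for name, units in chosen.items()})
```
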


6—Sequential Sampling—Sequential sampling has been used 
most widely in connection with quality control schemes, but it 
is gradually being applied to other fields also. According to 
this method the size of the sample is not fixed in advance. We 
decide after each sample observation or group of observations 
whether to accept, to reject, or to reserve judgement and continue 
sampling until a decision is finally reached. Very often, the 
final decision is reached after fewer observations have been taken 
than a sample of fixed size would require. This sampling technique 
is closely associated with the name of Abraham Wald. 


7—Multi-Stage Sampling—According to this method the 
sample is prepared by stages. The universe is divided into a 
number of large sampling units, each of which in turn is 
divided into smaller units and so on. A random sample is taken 
of the large units at the first stage and from those selected a 
further random sample is taken. For example, for a particular 
survey of India, we may select some states at random, then districts 
may be selected at random from those states, and from the districts 
cities may be picked at random. For nation-wide surveys it is the 
only method which is administratively practicable. But this system 
has a weakness also. Since random sampling errors cannot be avoided, 
they accumulate at every stage, and the sampling error will 
be larger under this method. 
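
The stage-by-stage selection of states, districts and cities may be pictured by the following sketch. The sampling frame is wholly hypothetical (4 states, each with 5 districts of 6 cities), as are the function name and the numbers drawn at each stage.

```python
import random


def multi_stage_sample(frame, n_states, n_districts, n_cities, seed=None):
    """Select states at random, then districts within them, then cities."""
    rng = random.Random(seed)
    selected = []
    for state in rng.sample(sorted(frame), n_states):                  # first stage
        districts = frame[state]
        for district in rng.sample(sorted(districts), n_districts):    # second stage
            cities = districts[district]
            for city in rng.sample(cities, n_cities):                  # third stage
                selected.append((state, district, city))
    return selected


# Hypothetical frame: 4 states, each with 5 districts of 6 cities.
frame = {
    f"State{s}": {
        f"District{s}.{d}": [f"City{s}.{d}.{c}" for c in range(6)]
        for d in range(5)
    }
    for s in range(4)
}
picked = multi_stage_sample(frame, n_states=2, n_districts=2, n_cities=3, seed=11)
print(len(picked), picked[0])
```
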


8—Quota Sampling—To economise in time and cost, 
American practice has provided a technique known as quota 
sampling. This method is common in making surveys of public 
opinion. Interviewers are given definite quotas of persons in 
different social classes, different age groups, different regions 
etc., and are then instructed to obtain the required number of 
interviews to fill each quota. The quotas ensure that the total 
sample includes approximately the right proportion of persons 
of the various categories. But the quotas are not filled by a 
random selection. Thus there are chances of bias on the part 
of interviewer. 


9—Area Sampling—This sampling technique is applied on 
a geographical basis. By the use of map references the entire 
area to be surveyed is broken down into smaller areas and a 
few of these areas are selected by random method. 


10—Proportionate Sampling—When it is desired that 
different segments of the universe should be represented in 
the sample according to their size, this technique is applied. If 
there are three sub-universes having 1000, 2000 and 
4000 items respectively, then the sample must contain items from 
these strata in the proportion of 1 : 2 : 4. 

11—Balanced Sampling—If a random sample is selected in 
such a manner that the average of the sample is the same as 
the average of the universe, then such a sample is known 
as a balanced sample. 


12—Line Sampling—This technique is helpful in agricultural 
surveys, when the sampling frame is not available. In this 
method of sampling, a point is fixed and a number of random 
lines are drawn in different directions through the point. Data 
are collected on all those items which lie along these lines, and 
these items are included in the sample. 


13—Interpenetrating Sampling—This technique was proposed 
in 1946 by Prof. Mahalanobis, who has used it in a number 
of Indian surveys. A simple random sample of ‘n’ units is 
divided at random into ‘k’ groups of units, each group containing 
n/k units. The field work in taking the sample is planned so 
that there is no correlation between the errors of measurement 
of any two units that are in different groups. 


14—Convenience Sampling—This is usually some off-hand, 
easy way of selecting items which results in obtaining a chunk 
of the population. A ‘chunk’ is a convenient slice of a 
population which is commonly referred to as a sample. 


The Size of the Sample. Whichever type of sampling 
technique is used, the inevitable question arises as to the size 
of the sample to be taken. The size of the sample should be 
‘the largest practicable’. Every increase in the sample size brings 
with it some increase in the precision of the sample estimate. 
The question as to the appropriate size of a sample is 
determined by the results required. 


The accuracy of a sample increases as the square root of 
its size increases. When it is found that a sample is inadequate, 
it is desirable to increase its size. If there are 100 items 
in the sample and it is desired to increase the accuracy twofold, 
then the sample should be increased as follows: 
$\sqrt{100} = 10$; $10 \times 2 = 20$; $20^2 = 400$. This 
will be the required number of sample items for the desired 
accuracy. 
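
The arithmetic above follows one rule: to multiply the accuracy by a given factor, the sample size must be multiplied by the square of that factor. A small sketch, with a hypothetical function name, is:

```python
def required_sample_size(current_size, accuracy_factor):
    """Accuracy varies as the square root of the size, so the size
    must grow by the square of the desired gain in accuracy."""
    return current_size * accuracy_factor ** 2


print(required_sample_size(100, 2))  # 400
```
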


Prof. A. R. Ilersic gives a formula for getting the size of a 
sample, which is 

$$\text{Standard error (\%)} = \sqrt{\frac{pq}{n}}$$

where p and q are the percentages of favourable and unfavourable 
items and n is the size of the sample. 


If it is desired that the sample should yield a .5% standard 
error, and the chances of favourable items are supposed to be 45% 
and those of unfavourable items are supposed to be 55%, then the 
size of the sample is found as follows: 

$$.5 = \sqrt{\frac{45 \times 55}{n}}$$

$$.25 = \frac{45 \times 55}{n}$$

$$.25\,n = 2475$$

$$n = 9900$$

Errors of Sampling. A sample is a substitute for a full 
count of the population from which it is drawn. The information 
derived from the sample represents the characteristics of the 
population. But we should not expect the sample mean to equal 
exactly the population mean. If we take repeated samples from 
the same population, the different sample means will differ 
amongst themselves, even though the population mean will 
remain the same. These discrepancies between sample statistics 
and population parameters are called ‘Sampling Errors’ or 
‘Fluctuations of Sampling’. These errors of sampling must be 
distinguished from the inaccuracies which arise in collecting 
data. These inaccuracies occur both in full counts and in samples, 
though they can be reduced in samples. Full counts do not by their 
nature contain sampling errors. 

Discrepancies between sample statistics and population 
parameters may be due to two reasons. Firstly, the method of 
selection of the sample may lead to bias in the sample in the 
sense that with repeated sampling the mean of a sample statistic 
(mean, standard deviation, etc) does not tend to the correspond- 
ing population parameter as the number of samples increases. 
Secondly even if bias is absent discrepancies will occur due to 
chance, because no sample will have all the characteristics of 
the population. The first source of discrepancies can be 
eliminated by the use of proper sampling technique. With the 
elimination of bias, the sampling errors which remain will be 
inevitable. They can be eliminated only by taking a full count, 
but they can of course be reduced by increasing the size of the 
sample. The sampling error between a sample statistic and the 
corresponding population parameter for which it is to be used 
as an estimate cannot, of course, be specified for any one 
particular sample. But the magnitude of the sampling error 
can be measured in the sense of the dispersion which sample 
values would have about the population value with repeated 
sampling. 


Sampling Distribution. Even if nothing is known about a 
population, it is possible on the basis of a random sample from 
that population to get reliable information about its nature. 
This is possible because of our knowledge of the ‘sampling 
distributions’ of the various sample statistics. A knowledge of 
sampling distributions is helpful not only in estimation 
but also in the testing of hypotheses. If we select independently 
a large number of random samples of a definite size from a 
universe and calculate the mean of each sample, there 
will be variations in them, but a majority of them would be 
found clustering round the mean of the universe. These means 
would form a frequency distribution. This will be true of every 
sample statistic: mean, standard deviation, etc. If the 
population is normally distributed, the sample statistic will also 
form a Normal Curve. Hence the Normal Curve is of fundamental 
importance to sampling. Even if the distribution of the universe 
is not normal, the means of the samples will still tend to be 
normal, provided the samples are of adequate size. 
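
The clustering of sample means round the mean of the universe can be seen in a small experiment. The sketch below is an illustration under stated assumptions: the universe is a hypothetical, deliberately skewed one of 10,000 values, and 2,000 samples of 50 are drawn from it. The names, sizes and seeds are arbitrary choices.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=None):
    """Means of repeated simple random samples drawn from the same universe."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical universe that is far from normal (exponential, skewed to the right).
source = random.Random(0)
population = [source.expovariate(1.0) for _ in range(10_000)]

means = sample_means(population, sample_size=50, n_samples=2_000, seed=1)

print("mean of the universe      :", statistics.fmean(population))
print("mean of the sample means  :", statistics.fmean(means))
print("spread of the sample means:", statistics.stdev(means))
print("sigma / sqrt(n)           :", statistics.pstdev(population) / 50 ** 0.5)
```

The last two figures should come out close to one another: the dispersion of the sample means is roughly the standard deviation of the universe divided by the square root of the sample size.
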


Standard Error. For any sampling distribution, the mean 
and standard deviation can be calculated. The standard deviation 
of the sampling distribution, which measures the variation between 
a large number of sample means, is known as the ‘standard error’. 
It indicates the precision of the sample estimate. 
The Normal Curve is helpful in finding out the extent of 
the standard error. The maximum and minimum limits of 
variation are decided by the standard error. 68.27% of the samples 
will have their mean values within a range of the population 
mean ± 1σ, or one standard error. Similarly 95.45% of the samples 
will have their mean values within a range of the population 
mean ± 2σ, and 99.73% will vary within the range of ± 3σ. 


The range of ± 3σ should be taken as the determining 
limit beyond which the population mean should not fall. If it lies 
beyond this limit, then the variation is not due to sampling error. 
Thus the standard error is a guide for measuring the precision of an 
estimate. The smaller the standard error of an estimate, the 
more precise and reliable it is. Sometimes the term ‘level of 
significance’ is used for finding out sampling error. For a 
particular level of confidence we have to find out the value of σ 
in the table of levels of significance. At the 5% level of 
significance σ = 1.96 and at the 2% level of significance σ = 2.326. 
Sampling error is found out with reference to these values of σ. With 
a 5% level of significance we will be accurate in 95% of the cases, 
with a 2% level of significance we will be accurate in 98% of 
the cases and with a 1% level of significance we shall be accurate 
in 99% of the cases. 
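
The areas and table values quoted above can be verified from the standard normal curve. The sketch below uses `statistics.NormalDist` from the Python standard library. The levels of significance are treated as two-sided, which is an assumption consistent with the figures 1.96 and 2.326 given in the text.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal curve: mean 0, standard deviation 1

# Proportion of sample means lying within 1, 2 and 3 standard errors of the mean.
areas = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}
for k, area in areas.items():
    print(f"within {k} standard error(s): {area:.2%}")

# Multiples of the standard error for two-sided levels of significance.
critical = {level: z.inv_cdf(1 - level / 2) for level in (0.05, 0.02, 0.01)}
for level, value in critical.items():
    print(f"{level:.0%} level of significance: {value:.3f}")
```
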


Theory of Sampling. The aim of sampling is to get from 
the sample an idea about the frequency distribution of the 
parent universe. With the help of standard statistical 
techniques, we try to find out the estimates of the mean, standard 
deviation, etc. of the parent universe from those of the samples. A 
major portion of the theory of sampling is devoted to the 
measurement of the constants of the parent universe on the 
basis of sample statistics. With samples of the sizes that are 
common in practice, there is often good reason to suppose that 
the sample estimates are approximately normally distributed. 
Consequently the sampling variance of the estimate is used to 
provide, in inverse terms, a measure of its precision. A 
considerable part of the theory deals with the calculation of formulas 
for the sampling variances. Another aim of the theory of 
sampling is to determine as to what degree of confidence can 
be placed in the estimates when they have been obtained or in 
other words what is the degree of precision of these estimates. 


The theory of sampling can be studied under two heads : 
(1) The Sampling of Attributes and (2) The Sampling of 
Variables. 


The Sampling of Attributes. In the sampling of attributes 
we are concerned with the possession or non-possession of some 
attribute by an item selected in sampling. For example, in 
sampling from births we may be concerned only with whether the 
child is male or female. The choosing of an individual in 
sampling may be called an ‘event’ or a ‘trial’, the possession 
of the attribute by the individual selected a ‘success’, and the 
non-possession a ‘failure’. 

Suppose that we take N samples with n events in each. 
If the chance of success of each event is p and that of failure 
is q = 1 − p, then the frequencies of samples with 0, 1, 2, ..., n 
successes are the successive terms in the expansion of 

$$N(q + p)^n$$

The expected value or the mean (M) of the number of successes 
is np, the variance is npq, and the standard deviation of the 
number of successes is $\sqrt{npq}$. 
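
The terms of the expansion can be computed directly. In the sketch below the figures are hypothetical (N = 64 samples of n = 6 tosses of a fair coin, so p = 1/2); they are chosen only because the expected frequencies then come out as whole numbers.

```python
from math import comb, sqrt


def expected_frequencies(N, n, p):
    """Terms of N(q + p)^n: expected number of samples with 0, 1, ..., n successes."""
    q = 1 - p
    return [N * comb(n, r) * p ** r * q ** (n - r) for r in range(n + 1)]


N, n, p = 64, 6, 0.5  # hypothetical: 64 samples of 6 tosses each
freqs = expected_frequencies(N, n, p)
print(freqs)                    # [1.0, 6.0, 15.0, 20.0, 15.0, 6.0, 1.0]
print(n * p)                    # mean number of successes, np = 3.0
print(sqrt(n * p * (1 - p)))    # standard deviation, sqrt(npq)
```
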


The standard error of the proportion of successes is 

$$\sqrt{\frac{pq}{n}}$$

These two formulae for calculating the standard deviation are 
of very great importance in sampling. The following illustrations 
will make this clear. 


Illustration—1 


A coin is tossed 1000 times and the head comes up 550 
times. Can the deviation from the expected value be due to 
fluctuations of simple sampling? 

Chance of getting a head, $p = \tfrac{1}{2}$ 

Expected frequency $= np = 1000 \times \tfrac{1}{2} = 500$ 

Standard deviation $= \sqrt{npq} = \sqrt{1000 \times \tfrac{1}{2} \times \tfrac{1}{2}} = 15.81$ 

The difference between the observed frequency and the expected 
frequency $= 550 - 500 = 50$. 


Since the difference is greater than three times the 
standard error ($15.81 \times 3 = 47.43 < 50$), it cannot be 
accounted for by fluctuations of simple sampling. 
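
The working of this illustration may be checked with a few lines of code. The variable names are illustrative; the test applied is the one used in the text, namely whether the difference exceeds three times the standard deviation.

```python
from math import sqrt

n, heads, p = 1000, 550, 0.5

expected = n * p                    # np
sd = sqrt(n * p * (1 - p))          # sqrt(npq)
difference = abs(heads - expected)

print(round(sd, 2))                 # 15.81
print(difference)                   # 50.0
print(difference > 3 * sd)          # True: beyond fluctuations of simple sampling
```
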


Illustration—2 


In some dice-throwing experiments, A threw dice 49152 
times and of these 25145 yielded a 4, 5 or 6. Is this consonant 
with the hypothesis that the dice were unbiassed? 

The probability of getting 4, 5 or 6 with a dice is $\tfrac{1}{2}$; hence 
the expected proportion of successes is $p = \tfrac{1}{2}$. 

The observed proportion of successes is $\frac{25145}{49152} = .5115$ 


The standard error of the proportion is 

$$\sqrt{\frac{pq}{n}} = \sqrt{\tfrac{1}{2} \times \tfrac{1}{2} \times \tfrac{1}{49152}} = \sqrt{.00000508} = .00225$$

Difference between the observed and expected proportions 
of successes $= .5115 - .5 = .0115$ 


3 times the standard error of the proportion $= 3 \times .00225 = .00675$ 


Since the difference is greater than 3 times the standard error, 
the deviation is not due to sampling fluctuations; the 
dice were biassed. 

Precision. The standard error indicates the unreliability of the value of p. The greater the standard error, the greater are the variations of the observed proportion. The reciprocal of the standard error, $\dfrac{1}{\text{s.e.}} = \sqrt{\dfrac{n}{pq}}$, gives a measure of reliability, and it is called precision. The precision of an observed proportion varies as the square root of the number of observations. To double the precision, which means reducing the standard error to one-half, the number of observations should be increased four times.
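The square-root rule can be seen numerically in the following minimal sketch. The helper names `se_proportion` and `precision`, and the sample sizes 1000 and 4000, are assumptions chosen only for illustration.

```python
import math

def se_proportion(p, n):
    """Standard error of an observed proportion: sqrt(pq/n)."""
    return math.sqrt(p * (1 - p) / n)

def precision(p, n):
    """Precision is the reciprocal of the standard error."""
    return 1 / se_proportion(p, n)

# Quadrupling the number of observations halves the standard error
# and therefore doubles the precision.
ratio = precision(0.5, 4000) / precision(0.5, 1000)
print(ratio)
```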


Comparison of Large Samples: There may be cases where two samples have been taken from different universes. Let the two samples give the proportions of A's as $p_1$ and $p_2$, the numbers in the two samples being $n_1$ and $n_2$. We have to find whether the difference $p_1 - p_2$ is significant of a real difference between the two populations with respect to the given attribute. On the hypothesis that the populations are similar in this respect, we can combine the samples to give the common value of the proportion of A's in the population by the formula

$p_0 = \dfrac{p_1 n_1 + p_2 n_2}{n_1 + n_2}$

This is the best possible estimate of $p_0$ that can be had in the given circumstances. The standard errors in the two samples are

$\text{s.e.}_1 = \sqrt{\dfrac{p_0 q_0}{n_1}}$ and $\text{s.e.}_2 = \sqrt{\dfrac{p_0 q_0}{n_2}}$

On the hypothesis that $p_1$ is really equal to $p_2$, the standard error of the difference would be

$\text{s.e.}_{1-2} = \sqrt{p_0 q_0 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

If $p_1 - p_2 > 3$ s.e., then the difference is not due to fluctuations of simple sampling, but to some other reasons.

Sometimes the proportions of A's are not the same in the two materials or universes from which the samples have been taken, and $p_1$ and $p_2$ are the true values of the proportions. The standard errors of sampling in the two cases are

$\text{s.e.}_1 = \sqrt{\dfrac{p_1 q_1}{n_1}}$ and $\text{s.e.}_2 = \sqrt{\dfrac{p_2 q_2}{n_2}}$

The standard error of the difference of $p_1$ and $p_2$ would be

$\text{s.e.}_{1-2} = \sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}$

If the difference $p_1 - p_2$ is less than 3 s.e., it may disappear on taking fresh samples, because this difference may be due to the fluctuations of simple sampling.


Illustration—2A

A machine puts out 16 imperfect articles in a sample of 500. After the machine is overhauled, it puts out 3 imperfect articles in a batch of 100. Has the machine been improved?

(B. Com. Delhi)

$p_1$ = proportion of imperfect articles in the first sample = 16/500 = .032

$p_2$ = proportion of imperfect articles in the second sample = 3/100 = .030

Difference between the proportions, $p_1 - p_2$ = .002. On the assumption that the machine has not been improved we have

$p_0 = \dfrac{p_1 n_1 + p_2 n_2}{n_1 + n_2} = \dfrac{\frac{16}{500} \times 500 + \frac{3}{100} \times 100}{500 + 100} = \dfrac{19}{600}, \qquad q_0 = \dfrac{581}{600}$

Standard error of the difference between the proportions

$= \sqrt{\dfrac{19}{600} \times \dfrac{581}{600} \left(\dfrac{1}{500} + \dfrac{1}{100}\right)} = .019$

The difference between the proportions is less than two times the standard error. Our hypothesis appears to be correct.

We conclude that it is likely that the machine has not been improved.
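The pooled calculation of this illustration can be reproduced with a small Python sketch. The function name `pooled_proportion_test` is hypothetical; the formulae are those given in the text for two samples assumed to come from similar populations.

```python
import math

def pooled_proportion_test(x1, n1, x2, n2):
    """Difference of two sample proportions and its pooled standard error.

    p0 = (x1 + x2) / (n1 + n2) is the combined estimate of the proportion,
    and s.e. = sqrt(p0 * q0 * (1/n1 + 1/n2)).
    """
    p1, p2 = x1 / n1, x2 / n2
    p0 = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))
    return p1 - p2, se

# Machine example: 16 imperfect in 500 before overhaul, 3 in 100 after
diff, se = pooled_proportion_test(16, 500, 3, 100)
print(round(diff, 3), round(se, 3), abs(diff) < 2 * se)
```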
Illustration—3

In a simple sample of 600 men from a certain large city, 400 are found to be smokers. In one of 900 from another city, 450 are found to be smokers. Do the data indicate that the cities are significantly different with respect to the prevalence of smoking among men?

$p_1 = \dfrac{400}{600} = \dfrac{2}{3}, \qquad p_2 = \dfrac{450}{900} = \dfrac{1}{2}$

The difference $p_1 - p_2 = \dfrac{2}{3} - \dfrac{1}{2} = \dfrac{1}{6} = .166$

On the assumption that these two cities are alike in respect of the smoking habit among men we get

$p_0 = \dfrac{p_1 n_1 + p_2 n_2}{n_1 + n_2} = \dfrac{\frac{2}{3} \times 600 + \frac{1}{2} \times 900}{600 + 900} = \dfrac{850}{1500} = \dfrac{17}{30}$

$q_0 = 1 - \dfrac{17}{30} = \dfrac{13}{30}$

Then

$\text{s.e.}_{1-2} = \sqrt{p_0 q_0 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = \sqrt{\dfrac{17}{30} \times \dfrac{13}{30} \left(\dfrac{1}{600} + \dfrac{1}{900}\right)} = \sqrt{.000682} = .026$

$p_1 - p_2$ is equal to .166

s.e.$_{1-2}$ × 3 = .026 × 3 = .078

As the difference is greater than 3 × s.e., our hypothesis is wrong, i.e. the assumption that the two cities are similar is wrong.


Illustration—4

In a certain association table the following frequencies were obtained:

(AB) = 309    (aB) = 132
(Ab) = 214    (ab) = 119

Can the association of the table have arisen as a fluctuation of simple sampling, the true association being zero?

(A) = (AB) + (Ab) = 309 + 214 = 523
(B) = (AB) + (aB) = 309 + 132 = 441
(b) = (Ab) + (ab) = 214 + 119 = 333
N = (B) + (b) = 441 + 333 = 774

As the true association is zero, hence

$\dfrac{(AB)}{(B)} = \dfrac{(Ab)}{(b)}$

The proportion of A's in B's = $p_1 = \dfrac{(AB)}{(B)} = \dfrac{309}{441} = .701$

The proportion of A's in b's = $p_2 = \dfrac{(Ab)}{(b)} = \dfrac{214}{333} = .643$

$p_1 - p_2 = .058$

$p_0 = \dfrac{(A)}{N} = \dfrac{523}{774} = .676$

$q_0 = 1 - .676 = .324$

$\text{s.e.}_{1-2} = \sqrt{p_0 q_0 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = \sqrt{.676 \times .324 \left(\dfrac{1}{333} + \dfrac{1}{441}\right)} = \sqrt{.001156} = .034$

$p_1 - p_2 = .058$ is less than 3 s.e.$_{1-2}$ (.034 × 3 = .102), hence the association of the table may have arisen due to fluctuations of sampling.


Illustration—5

In two large populations there are 35 and 30 per cent of fair haired people. Is the difference likely to be revealed by simple samples of 1500 and 1000 respectively from the two populations?

$P_1 = \dfrac{35}{100} = .35, \qquad P_2 = \dfrac{30}{100} = .3$

$P_1 - P_2 = .35 - .3 = .05$

The variance of the difference of the proportions in the samples is

$(\text{s.e.}_{1-2})^2 = \dfrac{P_1 Q_1}{n_1} + \dfrac{P_2 Q_2}{n_2}$

so that

$\text{s.e.}_{1-2} = \sqrt{\dfrac{.35 \times .65}{1500} + \dfrac{.3 \times .7}{1000}} = \sqrt{.000362} = .019$

As the difference in the proportions ($P_1 - P_2$ = .05) is more than 2.6 times the s.e., it is unlikely that the real difference will be hidden.
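When the two proportions are taken as the true values in different populations, the standard error is not pooled. The following sketch, with a hypothetical function name `se_difference_unpooled`, repeats the calculation for the fair-haired example.

```python
import math

def se_difference_unpooled(p1, n1, p2, n2):
    """Standard error of p1 - p2 when p1 and p2 are the true proportions:
    sqrt(p1*q1/n1 + p2*q2/n2)."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# 35 per cent and 30 per cent fair haired; samples of 1500 and 1000
se = se_difference_unpooled(0.35, 1500, 0.30, 1000)
ratio = (0.35 - 0.30) / se
print(round(se, 3), round(ratio, 1))
```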


SAMPLING OF VARIABLES 


Let us consider the sampling of variables such as height, age etc. Each member of the population of individuals provides a value of the variable, and we thus have a population of values of the variable and the frequency distribution determined by it.

Standard Error of the Mean. The standard error of the mean gives the range of deviation from the population mean within which the means of an infinite number of large samples would lie. The formula is

S.E. of Mean = $\dfrac{\sigma}{\sqrt{n}}$

$\sigma$ = Standard Deviation of the population
n = Number of items in the sample.

In practice the mean and the standard deviation of the population are not known. We have to find the standard error of the mean from the sample statistics. Then the formula would be

S.E. of Mean = $\dfrac{\sigma \text{ (sample)}}{\sqrt{n - 1}}$
Illustration—6

A sample of 1000 members is found to have a mean of 3.42 cm. Could it be regarded as a simple sample from a large population whose mean is 3.3 cm. and standard deviation is 2.6 cm.?

The S.E. of Mean = $\dfrac{\sigma}{\sqrt{n}} = \dfrac{2.6}{\sqrt{1000}} = \dfrac{2.6}{31.62} = .082$

Difference of the sample mean and the population mean = 3.42 - 3.3 = .12

This difference is less than 3 times the S.E. (3 × .082 = .246), hence it is not significant.


Illustration—7

Given the following information about a random sample of individual items, estimate with 99.7 per cent probability the limits within which the population mean lies.

Mean = 172
S.D. = 12
N = 82

S.E. of Mean = $\dfrac{\sigma \text{ (sample)}}{\sqrt{n - 1}} = \dfrac{12}{\sqrt{82 - 1}} = \dfrac{12}{9} = 1.3$

At 99.7% probability the true mean will lie within ± 3 S.E. of the sample mean, i.e.

172 ± 3.9 = 168.1 to 175.9
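The limits can be recomputed with a short sketch. The function name `mean_limits` and its parameter `k` (the number of standard errors) are assumptions made for illustration; the divisor sqrt(n - 1) follows the illustration. The text rounds the standard error to 1.3 before multiplying, so its limits differ slightly from the unrounded ones printed here.

```python
import math

def mean_limits(mean, sd, n, k=3.0):
    """Limits mean +/- k * S.E., with S.E. = sd / sqrt(n - 1)
    computed from the sample standard deviation."""
    se = sd / math.sqrt(n - 1)
    return se, mean - k * se, mean + k * se

# Illustration 7: mean 172, S.D. 12, N = 82, limits at 3 S.E.
se, low, high = mean_limits(172, 12, 82)
print(round(se, 2), round(low, 1), round(high, 1))
```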
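Standard Errors of Median, Quartiles etc.

S.E. of Median = $1.25331\,\dfrac{\sigma}{\sqrt{n}}$

S.E. of Quartiles = $1.36263\,\dfrac{\sigma}{\sqrt{n}}$

S.E. of 1st and 9th Deciles = $1.70942\,\dfrac{\sigma}{\sqrt{n}}$

S.E. of 2nd and 8th Deciles = $1.42877\,\dfrac{\sigma}{\sqrt{n}}$

S.E. of 3rd and 7th Deciles = $1.31800\,\dfrac{\sigma}{\sqrt{n}}$

S.E. of 4th and 6th Deciles = $1.26804\,\dfrac{\sigma}{\sqrt{n}}$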


Illustration—7A

The data concerning height measurements for a random sample of individuals from a given population are as follows:

Mean = 172"
S.D. = 12"
N = 65

If a large number of samples of the same size were selected at random from the given population, what would be the limits of the 5%, 2% and 1% significance levels for the true mean?

S.E. of Mean = $\dfrac{\sigma \text{ (sample)}}{\sqrt{n - 1}} = \dfrac{12}{\sqrt{65 - 1}} = \dfrac{12}{8} = 1.5$

at 5% significance level (1.5 × 1.96) = 2.94
at 2% significance level (1.5 × 2.326) = 3.489
at 1% significance level (1.5 × 2.576) = 3.864

At the 5% significance level the true mean will lie between 172 ± 2.94 = 169.06 to 174.94

At the 2% significance level the true mean will lie between 172 ± 3.489 = 168.511 to 175.489

At the 1% significance level the true mean will lie between 172 ± 3.864 = 168.136 to 175.864


Observed Differences between Averages

If two samples are taken from the same population, there may be some difference in their means. This difference may be either due to chance or due to some other factors. If the difference is due to chance, we conclude that the samples have been drawn from the same universe; on the other hand, if the difference is large, it is taken as significant and cannot be attributed to chance alone.

The standard error of the difference between two means is computed by the following formula:

$(\text{S.E.}_{1-2})^2 = \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$

or

$\text{S.E.}_{1-2} = \sqrt{(\text{S.E.}_1)^2 + (\text{S.E.}_2)^2}$
Illustration—8

A random sample of 200 villages was taken from Gorakhpur district and the average population per village was found to be 485 with a standard deviation of 50. Another random sample of 200 villages from the same district gave an average population of 510 per village with a standard deviation of 40. Is the difference between the averages of the two samples statistically significant? Give reasons.

$(\text{S.E.}_{1-2})^2 = \dfrac{\sigma_1^2}{n_1 - 1} + \dfrac{\sigma_2^2}{n_2 - 1} = \dfrac{(50)^2}{200 - 1} + \dfrac{(40)^2}{200 - 1} = 20.6$

$\text{S.E.}_{1-2} = 4.54$

or

$\text{S.E.}_1 = \dfrac{\sigma_1}{\sqrt{n_1}} = \dfrac{50}{\sqrt{200}} = \dfrac{50}{14.14} = 3.53$

$\text{S.E.}_2 = \dfrac{\sigma_2}{\sqrt{n_2}} = \dfrac{40}{\sqrt{200}} = \dfrac{40}{14.14} = 2.83$

$\text{S.E.}_{1-2} = \sqrt{(\text{S.E.}_1)^2 + (\text{S.E.}_2)^2} = \sqrt{3.53^2 + 2.83^2} = \sqrt{12.46 + 8.01} = \sqrt{20.47} = 4.5$

The observed difference between the two averages is 510 - 485 = 25, which is greater than 3 times the S.E. Hence it is not due to sampling fluctuations.


Test of Significance

The significance of the difference between two means can be tested by the following formula:

$T = \dfrac{D - D_h}{\text{S.E.}_{1-2}}$

T = Test of significance

D = Difference between the two sample means (not taking into account + or - signs)

$D_h$ = hypothetical difference, which is taken as zero

If the value of T exceeds 2.576, the difference between the two means is significant; if not, it is not significant.

By applying this formula in the above example we get

$T = \dfrac{(510 - 485) - 0}{4.5} = \dfrac{25}{4.5} = 5.56$

The value of T is much larger than 2.576, hence the difference is significant.
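A minimal sketch of this test for the village data follows. The function name `difference_of_means_test` is hypothetical, and the divisor n (rather than n - 1) is assumed, as in the second method of the illustration; unrounded figures are carried through, so the results differ slightly in the second decimal from the rounded working in the text.

```python
import math

def difference_of_means_test(mean1, sd1, n1, mean2, sd2, n2, hypothetical=0.0):
    """Return (S.E. of the difference, T) for two large samples,
    with S.E. = sqrt(sd1^2/n1 + sd2^2/n2) and T = (D - Dh) / S.E."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    d = abs(mean1 - mean2)
    return se, (d - hypothetical) / se

# Two samples of 200 villages each
se, t = difference_of_means_test(485, 50, 200, 510, 40, 200)
print(round(se, 2), round(t, 2), t > 2.576)
```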




Standard Errors of Other Estimates

(1) Difference between two medians
$= \sqrt{\left(1.25331\,\dfrac{\sigma_1}{\sqrt{n_1}}\right)^2 + \left(1.25331\,\dfrac{\sigma_2}{\sqrt{n_2}}\right)^2}$

(2) Standard Error of Mean Deviation
$= .6028\,\dfrac{\sigma}{\sqrt{n}}$

(3) Standard Error of Standard Deviation
$= \dfrac{\sigma}{\sqrt{2n}}$

(4) Standard Error of Variance
$= \sigma^2 \sqrt{\dfrac{2}{n}}$

(5) Standard Error of Coefficient of Variation
$= \dfrac{V}{\sqrt{2n}} \sqrt{1 + \dfrac{2V^2}{10^4}}$

(6) Standard Error of Quartile Deviation
$= .78672\,\dfrac{\sigma}{\sqrt{n}}$

(7) Standard Error of Karl Pearson's Coefficient of Skewness
$= \sqrt{\dfrac{3}{2n}}$

(8) Difference between two Standard Deviations
$= \sqrt{\dfrac{\sigma_1^2}{2n_1} + \dfrac{\sigma_2^2}{2n_2}}$

(9) Standard Error of Coefficient of Correlation
$= \dfrac{1 - r^2}{\sqrt{n}}$

(10) Standard Error of Regression Coefficient
$= \dfrac{\sigma_x}{\sigma_y} \sqrt{\dfrac{1 - r^2}{n}}$

(11) Standard Error of Regression Estimate, y on x
$= \sigma_y \sqrt{1 - r^2}$

(12) Standard Error of the Coefficient of Association
$= \dfrac{1 - Q^2}{2} \sqrt{\dfrac{1}{(AB)} + \dfrac{1}{(Ab)} + \dfrac{1}{(aB)} + \dfrac{1}{(ab)}}$


Illustration—9

To study the correlation between the stature of the father and the stature of the son, a sample of 1200 is taken from the universe of fathers and sons. The sample study gives the correlation between the two to be .46. Within what limits does it hold true for the universe?

The S.E. of r = $\dfrac{1 - r^2}{\sqrt{n}} = \dfrac{1 - (.46)^2}{\sqrt{1200}} = \dfrac{1 - .2116}{34.64} = \dfrac{.7884}{34.64} = .0227$

The r of the universe will lie between

.46 ± (3 × .0227) = .3919 to .5281

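Illustration—10

Given the following data, find out the standard error of the coefficient of association.

(AB) = 55
(Ab) = 11
(aB) = 13
(ab) = 48

Coefficient of Association or

$Q = \dfrac{(AB)(ab) - (Ab)(aB)}{(AB)(ab) + (Ab)(aB)} = \dfrac{(55 \times 48) - (11 \times 13)}{(55 \times 48) + (11 \times 13)} = \dfrac{2640 - 143}{2640 + 143} = +.89$

$\text{S.E.} = \dfrac{1 - Q^2}{2} \sqrt{\dfrac{1}{(AB)} + \dfrac{1}{(Ab)} + \dfrac{1}{(aB)} + \dfrac{1}{(ab)}}$

$= \dfrac{1 - (.89)^2}{2} \sqrt{\dfrac{1}{55} + \dfrac{1}{11} + \dfrac{1}{13} + \dfrac{1}{48}} = .047$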

Other Formulae for Different S.E.

The standard error of the difference between two sample means, as explained above, is

$\text{S.E.}_{1-2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

If the standard deviation of the population is given, then the formula would be

$\text{S.E.}_{1-2} = \sqrt{\sigma_{pop}^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

If the mean of sample number one is compared with the combined mean of the two samples:

$\text{S.E.} = \sqrt{\sigma_{pop}^2 \cdot \dfrac{n_2}{n_1 (n_1 + n_2)}}$

If the two samples are from universes between which there is a correlation, the standard error of the difference between the sample means would be:


Бе. S Jo c» 2r 71X82 


ny Ng ny--Ng 
Illustration—11 


In an intelligence test administered to 120 fathers and their 200 children, the following results were obtained:

Fathers: mean score 114, S.D. = 12
Sons: mean score 110, S.D. = 10

Assuming r between the two to be +.8, calculate the standard error of the difference between the two means and state whether the difference is significant.


2 2 
с co с с 
Sel үз : еу хх 


пә ny n» 
ass (2002 ЛЕТ 
= 4 120 + 200 — X 8X59 200 
144 100 
B Mr 120 и 


The actual difference is 114—110—4, which is less than: 
three times the S.E., hence insignificant. 


The standard error of the difference between two standard deviations can be calculated by the following formula, as has already been mentioned:

$\text{S.E.}_{1-2} = \sqrt{\dfrac{\sigma_1^2}{2n_1} + \dfrac{\sigma_2^2}{2n_2}}$

If the standard deviation of the population is given, the following formula will be applied:

$\text{S.E.}_{1-2} = \sqrt{\dfrac{\sigma_{pop}^2}{2} \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

If the standard deviation of sample one is compared with the combined standard deviation of the two samples, the following formula would give the standard error:

$\text{S.E.} = \sqrt{\dfrac{\sigma_{pop}^2}{2} \cdot \dfrac{n_2}{n_1 (n_1 + n_2)}}$

SMALL SAMPLES

The formulae and methods dealt with so far are applicable in the case of large samples. Small samples present a different problem, hence they need a different treatment. The assumption of normality cannot be made in their case. The smaller the sample, the greater the divergence found in the distribution. In order to measure the reliability of small samples, Student (a famous mathematical statistician) has developed a test known as 't' and Fisher has developed another test known as 'z'. These tests are based on the 't' distribution and the 'z' distribution respectively.

The standard deviation of small samples should be found by using the following formula:

$\sigma = \sqrt{\dfrac{\sum d^2}{n - 1}}$

n - 1 represents the number of degrees of freedom for the calculation of $\sigma$.

The test of significance of the mean in a small sample is done by obtaining the value of t by the following formula:

$t = \dfrac{\bar{x} - \mu}{\sigma} \sqrt{n}$

where $\bar{x}$ is the sample mean and $\mu$ the hypothetical mean of the universe. The calculated value of t is compared with the table value (Fisher and Yates table) for the given number of degrees of freedom and at a certain level of significance (generally 5%, or probability = .05). If it is greater than the table value, the mean is significantly different; otherwise not.


Illustration—12

Ten individuals are selected at random from a universe and their heights are found to be, in inches: 62, 62, 63, 64, 65, 68, 68, 69, 69, 70. In the light of these data discuss the suggestion that the mean height in the universe is 64 inches.

No.     Height in inches    d from average 66    d²
1       62                  -4                   16
2       62                  -4                   16
3       63                  -3                   9
4       64                  -2                   4
5       65                  -1                   1
6       68                  +2                   4
7       68                  +2                   4
8       69                  +3                   9
9       69                  +3                   9
10      70                  +4                   16

n = 10  660                                      88

average = $\dfrac{660}{10} = 66$

$\sigma = \sqrt{\dfrac{\sum d^2}{n - 1}} = \sqrt{\dfrac{88}{9}} = 3.13$

$t = \dfrac{\bar{x} - \mu}{\sigma} \sqrt{n} = \dfrac{66 - 64}{3.13} \times \sqrt{10} = 2.02$

The number of degrees of freedom is n - 1 = 10 - 1 = 9. The value of t for 9 degrees of freedom at the 5% level of significance is 2.262. The calculated t is equal to 2.02, which is less than 2.262, hence the difference is not significant.
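The working of this illustration can be verified with Python's standard library. This is only a sketch: the function name `one_sample_t` is hypothetical, and the table value 2.262 is the one quoted in the text rather than computed.

```python
import math
import statistics

def one_sample_t(values, hypothetical_mean):
    """t = (sample mean - hypothetical mean) / s * sqrt(n),
    where s uses the divisor n - 1 (statistics.stdev)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return mean, s, (mean - hypothetical_mean) / s * math.sqrt(n)

heights = [62, 62, 63, 64, 65, 68, 68, 69, 69, 70]
mean, s, t = one_sample_t(heights, 64)
print(mean, round(s, 2), round(t, 2), t < 2.262)
```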


In the same way, to test the significance of the difference between two sample means, t is calculated by the following formula:

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{S} \sqrt{\dfrac{n_1 n_2}{n_1 + n_2}}$

where $S = \sqrt{\dfrac{\sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2}{n_1 + n_2 - 2}}$

Where

$\bar{x}_1$ = mean of sample 1

$\bar{x}_2$ = mean of sample 2

$n_1$ = number of items in sample 1

$n_2$ = number of items in sample 2

S = standard deviation estimated from the two samples combined.

Degrees of freedom = $n_1 - 1 + n_2 - 1 = n_1 + n_2 - 2$
Fisher has given a method of testing the significance of the difference between two correlations. The r is transformed into z by the following formula:

$z = \tfrac{1}{2} \log_e \dfrac{1 + r}{1 - r}$ *

or $z = \log_{10} \dfrac{1 + r}{1 - r} \times 1.1513$

The standard error of z = $\dfrac{1}{\sqrt{n - 3}}$


Illustration—13

An r of +.5 is discovered in a sample of 28 items. Apply the z test to find out if this is significantly different from zero.

$z = \log_{10} \dfrac{1 + r}{1 - r} \times 1.1513 = \log_{10} \dfrac{1.5}{.5} \times 1.1513 = \log_{10} 3 \times 1.1513 = .4771 \times 1.1513 = .549$

$\text{S.E.} = \dfrac{1}{\sqrt{n - 3}} = \dfrac{1}{\sqrt{28 - 3}} = \dfrac{1}{5} = .2$

The z is more than twice the S.E., hence r is significantly different from zero.
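Fisher's transformation is the inverse hyperbolic tangent, so the calculation above can be checked in a few lines. The function name `fisher_z` is an illustrative assumption; the constant 1.1513 is the one used in the text for working with common logarithms.

```python
import math

def fisher_z(r, n):
    """Fisher's z = 0.5 * ln((1 + r) / (1 - r)) and its S.E. 1/sqrt(n - 3)."""
    z = 0.5 * math.log((1 + r) / (1 - r))   # identical to math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return z, se

z, se = fisher_z(0.5, 28)
# The same value via common logarithms, as in the text
z_common_log = math.log10((1 + 0.5) / (1 - 0.5)) * 1.1513
print(round(z, 3), round(z_common_log, 3), se, z > 2 * se)
```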


The 'z' test can also be used to test the significance of the difference of two sample coefficients of correlation by the following formula:

$\dfrac{z_1 - z_2}{\sqrt{\dfrac{1}{n_1 - 3} + \dfrac{1}{n_2 - 3}}}$

't' can also be applied at this stage. The value of t is calculated by the following formula:

$t = \dfrac{r}{\sqrt{1 - r^2}} \times \sqrt{n - 2}$

* Natural or Napierian system of logs. The relationship between the natural log and $\log_{10}$, which is ordinarily used, is $\log_e x = \log_{10} x \times 2.3026$.

Theoretical Questions


1. Why is sampling necessary? What is random sampling? How would you conduct a family budget enquiry among one lakh workers of an industrial area?


(L. W. Officer’s Training Course of W. B. Govt.) 


2. Show the necessity of the uses of the method of random 
sampling in any extensive investigation. How would you make 
use of these methods in carrying out an economic survey of the 
rural areas of M. P. ? (B. Com. Nagpur) 


3. What do you understand by ‘random sample’? Is it a 
synonym for 'representative sample ? Why is random sample 
supposed to speak for the population ? То what types of enquiries 
is the technique of random sampling specially applicable ? 


4. "If we had to choose between pure random sampling and 
purposive sampling, our choice would probably be determined by 
balancing the uncertainties of the former which are mainly due to 
fluctuations of chance and the uncertainties of the latter, which are 
mainly due to bias." (Yule & Kendall)


Examine the above statement and explain if it is possible to 
evolve a method so аз to gain some of the advantages of each while 
minimising the disadvantages of both. 


5. "The standard error can be used to gauge the precision of a statistical estimate or to permit a judgement being made of the divergence between expected and observed values." Discuss, explaining clearly the concept of standard error of an estimate and its various uses in practice.


(Vikram M. Com.) 


6. Why is sampling necessary in statistical investigations ? 
Explain the important methods of sampling commonly used. 
(Vikram M. Com.) 


Practical Questions 


1. A coin is tossed 1000 times and the head comes out 550 
times, Can the deviation from expected value be due to fluctuations 
of simple sampling. 


(S.E. 15.81—Actual difference greater than three times, 
hence cannot be accounted for fluctuations of simple sampling) 


2. A coin is tossed 10,000 times and head turns up 5195 times. Is it reasonable to think that the coin is unbiassed? (P.C.S.)

(S.E. 50, Actual difference is greater than 3 times the S.E.)

3. A coin is tossed 400 times and it turns head 216 times. Discuss whether the coin may be an unbiassed one and explain briefly the theoretical principles you would use for this purpose.

(I.A.S.)

(S.E. 10, coin is unbiassed)


4. A random sample of 500 pineapples was taken from a large consignment and 65 were found to be bad. Estimate the proportion of bad pineapples in the consignment as well as the standard error of the estimate. Deduce that the percentage of bad pineapples in the consignment almost certainly lies between 8.5 and 17.5.

(I.A.S.)

(S.E. of proportion = .015; proportion of bad pineapples lies between .13 ± 3 × .015)


5. A sample of 900 days is taken from the meteorological records of a certain district and 100 of them are found to be foggy. What are the probable limits to the percentage of foggy days in the district? (P.C.S.)
(S.E. of proportion = .0105; percentage is 8 to 14.25)


6. In a locality containing 18000 families, a sample of 840 families was selected at random. Of these 840 families 206 were found to have a monthly income of Rs. 50 or less. It is desired to estimate how many out of the 18000 families have a monthly income of Rs. 50 or less. Within what limits would you place your estimate? (S.E. of proportion = .015; 3600 to 5220 families)


7. Calculate the standard error of the mean from the following data collected in one of the many random sample inquiries conducted to find out the average earnings of a particular class:

Earnings P.M. in rupees    Number of Persons
up to 10                   50
up to 20                   150
up to 30                   300
up to 40                   500
up to 50                   700
up to 60                   800
up to 70                   900
up to 80                   1000

(S.E. .61) (M. Com. Alld.)


8—It is known that the mean and standard deviation of a variable are respectively 100 and 10 in the universe. It is, however, considered sufficient to draw a sample of sufficient size as to ensure that the mean of the sample would be in all probability within .01% of the true value. (Ans. N = 9000000)

9—What is meant by the standard error and what are its practical uses?

Intelligence tests on two groups of boys and girls give the following results. Examine if the difference is significant:

Girls    Mean 84    S.D. 10    No. 121
Boys     Mean 81    S.D. 12    No. 81
(P.C.S.)

(S.E. 1.61; actual difference is less than three times the standard error.)


10—A sample survey of 225 high school students in Gwalior gives an average expenditure of Rs. 49.50 per month. Is it likely that this was a random sample from a population with a monthly average expenditure of Rs. 50 and with a standard deviation of Rs. 1.80?

(S.E. of Mean = .12 and T = -4.16)


11—Random samples drawn from two countries gave the following data relating to the height of adult males.

                         Country A    Country B
Mean Height in Inches    67.42        67.25
Std. Dev.                2.58         2.50
No. in Sample            1000         1200

(i) Is the difference between the means significant?

(ii) Is the difference between the standard deviations significant?
(S.E.$_{1-2}$ = .109, T = 1.56, S.E. of S.D. = .077)


12—A man buys 100 electric bulbs of each of two well-known makes, taken at random from stock for testing purposes. He finds that make A has a mean life of 1300 Hrs. with a standard deviation of 82 Hrs. and make B has a mean life of 1248 Hrs. with a standard deviation of 93 Hrs. Discuss the significance of these results.

(S.E.$_{1-2}$ = 12.87, T = 4, difference in means significant; S.E. of S.D. = 8.762 and T = 1.25, difference in S.D. is not significant)


13—A random sample of 900 pairs of observations shows a coefficient of correlation of .35. What are the 95% limits to the correlation in the population? What would these limits be if the sample had 3600 pairs of observations?

[S.E. of r (for N = 900) = .02925 and (for N = 3600) = .01462
95% limits (for N = 900) .407 and .293
95% limits (for N = 3600) .379 and .321]

14—Two groups of students are given an intelligence test (x) and an arithmetic test (y).

$n_1$ = 45    $r_1$ = .45
$n_2$ = 39    $r_2$ = .38

Is the difference between the values of r significant?
($Z_1$ = .4854, $Z_2$ = .401, S.E. = .227; difference not significant.)

15—А random sample of 1000 farms in a certain year gives 
an average of wheat 2000 lbs per acre with a standard deviation 
of 192 lbs. A random sample of 1000 farms in a following year 
gives an average yield of 2100 lbs per acre with a standard 
deviation of 224 Ibs. Show that these data are inconsistent with 


the hypothesis that the average yields in the country as a whole 
were the same in the two years. (Vikram 1960) 


(5.Е. 1-2=9.32, T=10.7, hence difference is significant) 

16— Given the following frequencies in an association table :— 
(AB)=300 (aB)=140 
(Ab) =200 (ab) =120 


Do you think that the association could have arisen as а 
fluctuation of simple sampling, the true association being zero. 


(S.E, of the difference in proportions—.028, Deviation between 
the proportions—.055) 


17—Explain what do you understand by standard error of the 
mean of a random sample. 


The data concerning height measurements for a random sample of individuals from a given population are as follows:

Mean = 172 cm.
S.D. = 12 cm.
N = 65

If a large number of samples of the same size were selected at random from the given population, what would be the limits of the 2% confidence interval for the true mean? (M. Com. Alld.)
(S.E. of mean = 1.5; 168.52 to 175.48)


18—The median height of 100 M. Com. students is 66” with 
a standard deviation of 3” and the median height of 121 M.A. 
students is 64" with a standard deviation of 4". Are M. Com. 
students taller than M.A. students ? 


(S.E. Median .375993, .45575, S.E. of difference=.59) 
19—Personnel officers from various subsidiaries of a telephone holding company met to discuss the problem of absenteeism among women employees. Random samples of 65 employees were taken and their absenteeism was observed for one month. Two of these officers reported:

$\bar{x}_1$ = 6 Hrs.    $\sigma_1$ = 2.0    $N_1$ = 65
$\bar{x}_2$ = 8 Hrs.    $\sigma_2$ = 2.7    $N_2$ = 65

This comment was made: "The small difference you see here is doubtless due to the use of samples. I doubt very much that our absenteeism is any different from the other office".

Test this statement of the null hypothesis.
(Ans. S.E.$_{1-2}$ = .4, T = 5, difference in absenteeism is significant.)
20—Suppose a sample of 197 families is randomly chosen out of 8000 families residing in an area, and their indebtedness position is analysed. The following table describes the position.

Indebtedness    No. of families    Indebtedness    No. of families
0-20            1                  100-120         48
20-40           8                  120-140         31
40-60           14                 140-160         6
60-80           29                 160-180         2
80-100          67                 180-200         1
                                   Total           197

What is the range within which the average indebtedness of the 8000 families is likely to be?
(Ans. Rs. 96.69 to Rs. 98.73)
(Cal. Univ. Dip. S. W.)


21—In a random sample of 800 adults from the population 
of a certain large city 600 are found to have dark hair. In a 
random sample of 1000 adults from the inhabitants of another 
large city 700 are dark haired. Show that the difference of the 
proportions of dark haired people is about 2.4 times the standard 
error of this difference from samples of the above sizes. 


(Ans. The variance of the difference of proportions is .00045, so the S.E. is .021; the difference of proportions is .05, which is about 2.4 times the S.E.)


22—One thousand articles from a factory are examined and found to be 2.5 per cent defective. Fifteen hundred similar articles from a second factory are found to have only 2 per cent defectives. Can it reasonably be concluded that the products of the first factory are inferior to those of the second?

(Ans. Difference between proportions is .5%, S.E. = .65%. Articles of both factories are of similar quality.)


23—A random sample of 1000 men from Northern India gives their mean wage to be Rs. 2.50 per day with a standard deviation of Rs. 1.50. A sample of 1500 men from Southern India gives a mean wage of Rs. 2.69 per day with a standard deviation of Rs. 2.00. Discuss whether the mean rate of wages varies as between the two regions. (B.A. Hons. Delhi)


(The S.E. of the difference between the mean=.005 which 
is more than 8 times the real difference, hence difference is not 
due to fluctuations of sampling) 

CHAPTER 14


PROBABILITY


"Probability theory is of interest, not only to card and dice players, who were its godfathers, but also to all men of action, heads of industries or heads of armies, whose success depends on decisions, which in turn depend on two sorts of factors, the one known or calculable, the other uncertain and problematical".

Émile Borel


Development of the Theory. The first impetus to the study of the theory of probability came from gamblers. Always eager to develop a system, having great faith in Goddess 'Fortune' or superstitions, some gamblers began in the seventeenth century to take their gambling problems to eminent mathematicians. In such a fashion, Galileo, Pascal, Fermat and others were persuaded to apply their genius to problems of chance, which opened up the way to the development of probability theory. Pascal's important contribution to this science stemmed from some work on gambling problems that his gambler friend de Méré asked him to solve. Other pioneering work in this field was done by a man who was a gambler himself, the Italian Cardano. He was also an astrologer, murderer, thief and scientist with a fantastic career. His name is still connected with a solution of a famous algebraic problem.


By the late eighteenth century, the study of the laws of 
chance no longer needed the helping hand of gamblers. A 
number of scholars had addressed themselves to the field of 
probability and early in the nineteenth century the famous 
Frenchman Laplace and the German Gauss carried the knowledge 
of the subject many important steps forward. 


With the expansion of national economies some great persons like De Moivre, Nicholas and Daniel Bernoulli, Euler and D'Alembert were inspired to develop probability theory further and apply it for the first time to the financial, public health, military and political fields. In more recent years R.A. Fisher, the Pearsons, father and son, and J. Neyman developed a sampling theory based on the laws of probability. Today a comprehensive theory of probability exists. Use of probability theory in the fields of business and economics has been relatively slow, as compared to its application in the fields of physics, astronomy and genetics.


Meaning. Probability has an everyday meaning. Sometimes we hear phrases like "His chance of winning is pretty small", "It is pretty likely that it will rain before tomorrow", "You are probably right," or "There is a fifty-fifty chance of success". In each of these phrases an idea of uncertainty is acknowledged. Goethe remarked that, "There is nothing more frightful than ignorance in action." Reasoning in terms of probabilities is one weapon by which we attempt to reduce this uncertainty or ignorance. The use of the word 'probability' in statistics, however, is somewhat different. It is more precise than what it means in popular use. According to Laplace, "Probability is the ratio of the number of favourable cases to the total number of equally likely cases." In other words, "If there are several equally likely events that may happen, the probability that any one of these events will happen is the ratio of the number of cases favourable to its happening to the total number of possible cases."

To define it in mathematical terms, if an event can happen in 'a' ways and fail in 'b' ways, and all these ways are equally likely to occur, then the probability of its happening is $\dfrac{a}{a + b}$ and the probability of its not happening is $\dfrac{b}{a + b}$. For example, if we toss a coin, the probability of its coming down is 1, of the head coming up is 1/2 (or P = 1/2), and of the head coming down is 1/2.
Illustrations 


(a) What is the chance of drawing a king in a draw from 
а Pack of 52 cards. 


Total Number of cases that can happen=52 


580 AN INTRODUCTION TO MODERN STATISTICS 


No of favourable cases—total number of kings in the pack 
of cards—4 
peus 1 
Th babilily. а 
e probability i E 
(b) An urn contains two blue balls and three white balls. 


Find the probability of a blind man obtaining one blue ball in 
a single draw. 


в 
2-8 — 5 

(c) The Registrar of Vital Statistics in a country reported 
6858 sons and 6543 daughters for a specific region. Treating 
this as a fair sample from the general population, estimate the 
probability that a child to be born will be a boy. 

The event can happen in 6858 ways and can fail in 6543 
ways; hence P = 6858/(6858+6543) = 6858/13401, or about 0.512. 

(d) What is the probability that a vowel selected at random 
in any English book is an 'i'? 

Treating the five vowels as equally likely (a simplification made 
for the sake of illustration): 

Total number of equally likely events = 5 
Number of favourable events = 1 
P = 1/5 


(e) What is the probability of drawing a king from a Pinochle deck? 
(A pinochle deck consists of 2 aces, 2 kings, 2 queens, 2 jacks, 
2 tens and 2 nines of each suit. There are no cards of lower 
value.) 

Total number of cases = total number of cards = 48 

Number of favourable cases = total number of kings in 
the deck = 8 

P = 8/48 = 1/6. 

Probability, when expressed numerically, takes the shape 
of a fraction. The fraction can never exceed one; in other 
words, the probability of any event can never be more 
than one, because the number of favourable events cannot be more 
than the total number of events that can happen. If the 
probability of an event is equal to one, it means that 
all the equally likely events are favourable: the event in question 
is certain to happen. If, on the other hand, the probability 
of an event is equal to zero, it means that out of the total 
number of equally likely events none is favourable; in other 
words, it is impossible that the event will happen. A probability 
expressed as a fraction can be turned into a percentage by 
multiplying it by 100. For example, the probability of drawing 
a queen out of a pack of 52 cards is 4/52 = 1/13, or 
1/13 × 100 = 7.7%. 

Mathematical vs Statistical Probability. Where we have 
full confidence of an event happening out of several possible 
alternatives (even before the event happens), which are 
mutually exclusive, then the probability of the event happening 
is easily assessed, and it is as explained above. 
This is known as 'Mathematical Probability' or 'A priori 
probability'. Its use is associated with games of chance, such 
as the throw of a coin or a die, where we know for certain the 
number of possible alternatives, such as head or tail of the 
coin turning up in the throw of a coin, or any of the 
six faces marked 1, 2, 3, 4, 5 and 6 turning up in the throw 
of a die. 


On the other hand, if there is no such certainty about the event 
happening and we have to base the probability on past experience, 
that is, upon a long series of experiments, then the probability 
is known as 'Statistical', 'Empirical' or 'a posteriori' probability. 

A precise definition of statistical probability will be: 

"If a large number of trials be made under constant 
conditions, then the ratio of the number of trials in which a 
certain event happens to the total number of trials (N) would 
approach a limit, as N is indefinitely increased, and this limit 
is the probability of the event happening." 


1 M. Vaidyanathan, Latest Statistical Methods, p. 309. 

There is a major difference between a priori and empirical 
probabilities. An a priori probability can always be expressed as 
a precise number, while an empirical one is an estimate and 
therefore only an approximation. Most of the probabilities of 
interest in business and economics are empirical 
probabilities. 
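
The limiting idea in the definition of statistical probability can be illustrated with a small simulation. The following is only a sketch: the function name `relative_frequency`, the fixed seed and the use of a simulated fair coin are assumptions made for the illustration, not part of the text.

```python
import random

def relative_frequency(trials, seed=0):
    """Share of heads in `trials` simulated tosses of a fair coin."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(trials))
    return heads / trials

# The ratio of favourable trials to all trials settles near 1/2 as N grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```

With few trials the ratio may lie far from 1/2; as N is increased it stays closer and closer to it, which is the sense in which an empirical probability is an estimate.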


Events 


(a) Equally likely events. In stating the theory of probability, 
the phrase 'equally likely events' is used. These are 
events none of which is expected to happen in preference to the 
others. While tossing a coin, either head will turn up or tail 
will turn up; thus the equally likely events are two. The 
following illustration will clearly indicate the meaning of 
'equally likely'. 

Two players A and B are tossing a coin with the following 
rules: the coin is to be tossed twice. If the head appears 
on either toss A wins; if head does not appear B wins. 

All equally likely cases will be: 

I Toss    II Toss 
H         H         In this case A wins 
H         T         In this case A wins 
T         H         In this case A wins 
T         T         In this case B wins 

Hence the chance of A's winning is 3/4. 
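
The four equally likely cases of the two-toss game can also be listed mechanically. This is a minimal sketch; the variable names are chosen only for the illustration.

```python
from fractions import Fraction
from itertools import product

# Every ordered result of two tosses: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# A wins whenever a head appears on either toss.
a_wins = [o for o in outcomes if "H" in o]

p_a = Fraction(len(a_wins), len(outcomes))
print(outcomes)
print("Chance of A winning:", p_a)
```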

(b) Mutually Exclusive Events. Mutually exclusive events 
are those events the occurrence of any one of which rules out the 
occurrence of the others. For example, if we toss a coin, the 
coming up of head will prevent the turning up of tail. Similarly, 
if we throw a die, any of the six faces may be uppermost when 
the die comes to rest. The total number of possible ways is six 
for a single throw. The different cases are mutually exclusive, 
since no two faces can be uppermost at the same time. 

(c) Independent Events. Events are said to be independent 
if the occurrence of one does not affect the occurrence of any 
of the others. Two events are independent when they have no 
influence on each other. The result of the first toss of a coin 
does not affect the result of the second toss at all. 

(d) Dependent Events. If the occurrence of one event 
affects the happening of the other event, then they are said 
to be dependent events. For example, the probability of 
drawing a king out of a pack of 52 cards is 4/52 or 1/13. If it 
happens that a king is drawn and is not replaced in the pack, the 
probability of drawing a king again would be 3/51. Hence the 
happening of the first event has affected the happening of the 
second event; thus they are dependent events. 


Probability Theorems 


(1) Addition Theorem. The theorem is stated as follows: 
"The probability of either one or another of any number of 
mutually exclusive events is the sum of the probabilities of the 
separate events happening." The theorem is simple and self-evident. 
For example, the probability of getting spot (1) in a 
throw of a single die is 1/6, the probability of getting spot (3) is 
1/6, and the probability of getting spot (5) too is 1/6. The 
probability of getting an odd number (1, 3 or 5) in a throw of a 
single die will be the addition of their respective probabilities, 
that is 1/6 + 1/6 + 1/6 = 3/6 = 1/2. 


Illustration 1 

A bag contains 4 white, 2 black, 3 yellow and 3 red balls. 
What is the probability of getting a white or red ball at random 
in a single draw of one? 

The probability of getting one white ball = 4/12 
The probability of getting one red ball = 3/12 
The probability of getting a white or red ball = 4/12 + 3/12 = 7/12 

7/12 × 100 = 58.3% 


The addition theorem will hold good only if: 
(a) the items are mutually exclusive; 
(b) the mutually exclusive items belong to the same set. 

Illustration 2 

A bag contains 25 balls, numbered from 1 to 25; one is to 
be drawn at random. Find the probability that the number 
of the drawn ball will be a multiple of 5 or of 7. 

The probability of the number being a multiple of 5 
(5, 10, 15, 20, 25) = 5/25 

The probability of the number being a multiple of 7 
(7, 14, 21) = 3/25 

Thus the probability of the number being a multiple of 5 
or 7 will be 5/25 + 3/25 = 8/25. 

In the above illustration, find the probability that the 
number is a multiple of 3 or 5. 

The probability of the number being a multiple of 3 
(3, 6, 9, 12, 15, 18, 21, 24) = 8/25 

The probability of the number being a multiple of 5 
(5, 10, 15, 20, 25) = 5/25 

Joint probability = 8/25 + 5/25 = 13/25, but this answer is wrong 
because item No. 15 is a multiple of both 3 and 5; the two events 
are therefore not mutually exclusive, and 15 has been counted 
twice. Hence the correct probability will be 12/25. 
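
The double counting of ball No. 15 can be checked by treating each event as a set of ball numbers. The sketch below uses helper names (`multiples`, `p`) that are introduced only for this illustration.

```python
from fractions import Fraction

balls = range(1, 26)  # balls numbered 1 to 25

def multiples(k):
    """Set of ball numbers that are multiples of k."""
    return {n for n in balls if n % k == 0}

def p(event):
    """Probability of drawing a ball belonging to the set `event`."""
    return Fraction(len(event), len(balls))

# Multiples of 5 and of 7 do not overlap, so simple addition works.
p_5_or_7 = p(multiples(5) | multiples(7))

# Multiples of 3 and of 5 share the ball 15, so it must be counted once only.
overlap = multiples(3) & multiples(5)
p_3_or_5 = p(multiples(3) | multiples(5))

print(p_5_or_7, p_3_or_5, overlap)
```

Taking the union of the two sets counts each ball once, which is the same as adding the two probabilities and subtracting the probability of the overlap.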


It is also essential that mutually exclusive items must belong 
to the same set. To illustrate this point Von Mises has given the 
following example: 

Suppose the probability of a man dying between his 40th 
and 41st birthdays is 0.011, and the probability of his marrying 
between his 41st and 42nd birthdays is 0.009. These events are 
mutually exclusive, but it cannot be said that the probability of 
a man either dying between his 40th and 41st birthdays or marrying 
between his 41st and 42nd birthdays is .011 + .009 = .02. These 
two events do not belong to the same set. 

(2) Multiplication Theorem. If a compound event is made 
up of a number of separate and independent sub-events, the 
probability of the occurrence of the compound event is the product 
of the probabilities of each of the sub-events happening. Since the 
two or more sub-events are independent of one another, 
the number of ways in which the compound event can 
happen is the product of the number of ways in which each sub-event 
can happen, and on this basis we find the probability 
of each compound event happening. Thus if two coins are 
tossed independently of each other, since with each toss there are 
two ways in which each sub-event can happen, the total number of 
compound events possible is 2 × 2 = 4, so that the probability of 
a single compound event happening is 1/4. 
Illustrations 

(A) What is the probability of throwing two 'fours' in two 
throws of a die? 

The probability of a 'four' in the first throw = 1/6 
The probability of a 'four' in the second throw = 1/6 
The probability of two 'fours' = 1/6 × 1/6 = 1/36 

(B) What is the probability of getting all heads in four 
throws of a coin? 

The chance of getting a head in the 1st throw = 1/2 
The chance of getting a head in the 2nd throw = 1/2 
The chance of getting a head in the 3rd throw = 1/2 
The chance of getting a head in the 4th throw = 1/2 

Thus the probability of getting heads in all throws 
= 1/2 × 1/2 × 1/2 × 1/2 = 1/16 

(C) Suppose it is 9 to 7 against a person A, who is now 35 
years of age, living till he is 65, and 3 to 2 against a person B, 
now 45, living till he is 75; find the chance that at least one of 
these persons will be alive 30 years hence. (B. A. Punjab) 

The chance that A will die within 30 years is 9/16, and the 
chance that B will die within 30 years is 3/5. The events are 
independent; therefore, the chance that both will die is 
9/16 × 3/5 = 27/80. 

Then the chance that both will not be dead, i.e. that at least one 
will be alive, is 1 - 27/80 = 53/80. 


(D) A problem in statistics is given to three students A, B, C, 
whose chances of solving it are 1/2, 1/3, 1/4 respectively. What is 
the probability that the problem will be solved? (M. Sc. Agra) 

Probability that student A will fail to solve the problem 
= 1 - 1/2 = 1/2 

Probability that student B will fail to solve the problem 
= 1 - 1/3 = 2/3 

Probability that student C will fail to solve the problem 
= 1 - 1/4 = 3/4 

Since the events are independent, the probability that all 
the three students A, B, C will fail to solve the problem 
= 1/2 × 2/3 × 3/4 = 1/4 

Therefore the probability that the problem will be solved 
= 1 - 1/4 = 3/4 
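
Illustrations (C) and (D) use the same device: multiply the independent chances of failure, then subtract the product from one. A minimal sketch follows; the function name `p_at_least_one` is an assumption made for the illustration.

```python
from fractions import Fraction
from math import prod

def p_at_least_one(failure_probs):
    """One minus the product of the independent probabilities of failure."""
    return 1 - prod(failure_probs)

# (C): A dies with chance 9/16, B dies with chance 3/5.
alive = p_at_least_one([Fraction(9, 16), Fraction(3, 5)])

# (D): the students fail with chances 1 - 1/2, 1 - 1/3 and 1 - 1/4.
solved = p_at_least_one([1 - Fraction(1, 2), 1 - Fraction(1, 3), 1 - Fraction(1, 4)])

print(alive, solved)
```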


Moroney, in his book 'Facts from Figures', explains the 
multiplication theorem by the following illustration: 

"Consider the case of a man who demands the simultaneous 
occurrence of many virtues of an unrelated nature in his young 
lady. Let us suppose that he insists on a Grecian nose, 
platinum-blonde hair, eyes of odd colours, one blue and one brown, 
and finally a first class knowledge of statistics. What is the 
probability that the first lady he meets in the street will put 
ideas of marriage into his head?" 

It is difficult to apply the multiplication theorem in this case 
because the events do not belong to the same set. Hence this 
theorem will hold good only in cases where the events belong 
to the same set. 


(3) Theorem of Conditional Probabilities. If sub-events are 
not independent, and the nature of the dependence is known, we 
have the theorem of conditional probabilities. This theorem is 
more or less a corollary of the multiplication theorem. The 
theorem is that the probability that both of two dependent 
sub-events occur is the product of the probability of the first 
sub-event and the probability of the second after the first 
sub-event has occurred. For example, if from a shuffled pack of 
cards a king turns up first and the card is not 
restored, then after a second shuffling the probability of a king 
turning up on both occasions is 4/52 × 3/51 = 1/221, since there 
are 4 kings at the first shuffle of 52 cards and only 3 kings at 
the second shuffle of 51 cards. Conditional probabilities are 
often known as probabilities due to partial exhaustion of a sample. 


(4) Formula of combinations applied to Probability. In 
finding the probability of an event, we may 
often have to select a sample out of a larger number, and then 
to calculate the probability of an event in the 
sample. If out of 50 articles 40 are good and 10 are bad, and 
if a sample of 5 is selected out of the 50 at random, the probability 
of having 3 good and 2 bad in the sample would be 

P = (40C3 × 10C2) / 50C5 

(since 3 good can be selected in 40C3 ways and 2 bad in 10C2 ways, 
while 5 articles can be selected out of 50 in 50C5 ways). 
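
The value of this expression is easily worked out with the standard library function `math.comb`. The snippet is a sketch for checking the arithmetic only.

```python
from fractions import Fraction
from math import comb

# 3 good out of 40 and 2 bad out of 10, against any 5 out of 50.
p = Fraction(comb(40, 3) * comb(10, 2), comb(50, 5))

print(p, round(float(p), 4))
```

The probability comes to roughly 0.21, i.e. about one sample in five would contain exactly 3 good and 2 bad articles.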
Permutations and Combinations in the Theory 
of Probability 


Permutations. Sometimes we are interested in the total 
number of different ways in which items can be arranged so that 
the order of components is important yet no two arrangements 
are alike. Arrangements of this sort are called permutations. 
For example, if seven letters, A, B, C, D, E, F, G, are to be 
arranged by taking two letters at a time, but under no 
circumstances may an arrangement contain the same letter twice 
(like AA or BB), then the following permutations are possible: 

AB AC AD AE AF AG 
BA BC BD BE BF BG 
CA CB CD CE CF CG 
DA DB DC DE DF DG 
EA EB EC ED EF EG 
FA FB FC FD FE FG 
GA GB GC GD GE GF 

Hence there are 7 × 6 = 42 permutations. 

Thus the following formula gives the number of permutations: 
Perm. = n(n-1) 

If 26 letters are to be arranged in this manner the total 
number of ways will be 26 (26-1) = 650. 
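
The table of two-letter arrangements can be generated rather than written out by hand. The following sketch relies on `itertools.permutations` and `math.perm` from the standard library; the variable name `pairs` is chosen for the illustration.

```python
from itertools import permutations
from math import perm

# Ordered arrangements of two different letters out of seven.
pairs = ["".join(p) for p in permutations("ABCDEFG", 2)]

print(pairs[:6])     # first row of the table: AB AC AD AE AF AG
print(len(pairs))    # n(n-1) with n = 7
print(perm(26, 2))   # the same count for 26 letters
```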


Illustration 3 

If a man has the choice of travelling between Bhopal and 
Delhi by 8 trains, in how many possible ways can he complete 
the return journey, using a different train in each direction? 

For the outward journey he has the choice of all the 
8 trains. Having completed the outward journey, he will be left 
with only 7 trains to complete the return journey. Thus the 
total number of ways in which he can complete the journey 
is 8 (8-1) = 56. 

"Thus, if there are 'm' ways of doing an act, and 'n' ways 
of doing another act, there are m × n ways of doing those acts, 
or N = m × n. Similarly, if there are again p ways of doing a third 
act, then N = m × n × p, and this can further be extended." 

If more than two operations are to be performed, the 
formula will be 

Perm. = n (n-1) (n-2) (n-3) (n-4) ... 

with as many factors as there are operations. 

Illustration 4 

There are six doors in a room. Four persons have to enter 
it. In how many ways can they enter by different doors? 

Perm. = n(n-1) (n-2) (n-3) 
= 6 × (6-1) × (6-2) × (6-3) 
= 6 × 5 × 4 × 3 = 360 

The formula can also be written 

Perm. = n!/(n-r)! 

where r is the number of things taken at a time. The sign (!) is 
known as 'factorial'. 

In the above illustration 

Perm. = 6!/(6-4)! = (6 × 5 × 4 × 3 × 2 × 1)/(2 × 1) = 360 
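Illustration 5 

In how many ways can 12 seats be occupied by 6 students? 

Perm. = n!/(n-r)! = 12!/(12-6)! 
= (12 × 11 × 10 × 9 × 8 × 7 × 6 × 5 × 4 × 3 × 2 × 1)/(6 × 5 × 4 × 3 × 2 × 1) 
= 12 × 11 × 10 × 9 × 8 × 7 = 6,65,280 

The number of ways in which m+n+p things can be 
divided into three groups containing m, n, p things respectively is 

Perm. = (m+n+p)!/(m! × n! × p!) 

If m = n = p, the formula becomes (3m)!/(m! × m! × m! × 3!), 
the further division by 3! being made because three groups of 
equal size can be interchanged without giving a new division. 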

Illustration 6 

In how many ways can 12 students be allotted to three 
tutorial groups of 2, 4, and 6 respectively? 

(m+n+p)!/(m! × n! × p!) = (2+4+6)!/(2! × 4! × 6!) 
= (12 × 11 × 10 × 9 × 8 × 7 × 6 × 5 × 4 × 3 × 2 × 1)/((2 × 1) × (4 × 3 × 2 × 1) × (6 × 5 × 4 × 3 × 2 × 1)) 
= (12 × 11 × 10 × 9 × 8 × 7)/(4 × 3 × 2 × 2 × 1 × 1) = 13860. 

If the tutorial groups are to be made of 4 each, then 

(3m)!/(m! × m! × m! × 3!) = (4 × 3)!/(4! × 4! × 4! × 3!) 
= (12 × 11 × 10 × 9 × 8 × 7 × 6 × 5 × 4 × 3 × 2 × 1)/((4 × 3 × 2 × 1) × (4 × 3 × 2 × 1) × (4 × 3 × 2 × 1) × (3 × 2 × 1)) 
= 5775. 
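
Both results of Illustration 6 can be verified with a few lines. This is a sketch only; the function name `ways_to_group` is hypothetical.

```python
from math import factorial

def ways_to_group(*sizes):
    """(m+n+p+...)! divided by m! × n! × p! × ... for groups of the given sizes."""
    ways = factorial(sum(sizes))
    for size in sizes:
        ways //= factorial(size)
    return ways

unequal = ways_to_group(2, 4, 6)
# Three groups of equal size are interchangeable, hence the extra division by 3!.
equal = ways_to_group(4, 4, 4) // factorial(3)

print(unequal, equal)
```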


If there are n things of which p are of one kind, q are of 
another kind and r of a third kind, then the number of 
permutations will be: 

Perm. = n!/(p! × q! × r!) 

Illustration 7 

In how many ways can the letters of the word "BETTERE" 
be arranged? 

In this word the letter 'E' comes thrice, 'T' comes twice, and 
'B' and 'R' come only once. 

Hence Perm. = n!/(p! × q! × r!) = 7!/(3! × 2! × 1! × 1!) 
= (7 × 6 × 5 × 4 × 3 × 2 × 1)/((3 × 2 × 1) × (2 × 1) × 1 × 1) = 5040/12 
= 420 

Combinations. In permutations the order of the grouped 
items is important; in combinations the order does not matter. 
Combinations are selections of items where order is 
unimportant and duplication of components is inadmissible. 


If the letters A, B, C, D, E are to be arranged in twos, but the 
same letter is not to be used twice in an arrangement, the 
permutations will be 20: Perm. = n(n-1) = 5(5-1) = 20 

AB, AC, AD, AE, 
BA, BC, BD, BE, 
CA, CB, CD, CE, 
DA, DB, DC, DE, 
EA, EB, EC, ED 

In this case the combinations will be: 
AB, AC, AD, AE, 
BC, BD, BE, 
CD, CE, 
DE 

nCr = n!/(r! (n-r)!) = 5!/(2! (5-2)!) 
= (5 × 4 × 3 × 2 × 1)/((2 × 1) × (3 × 2 × 1)) = 10 

or 

nCr = nPr/r! 

where nCr = number of combinations with r things at a time, 
nPr = number of permutations with r things at a time, 
r! = factorial 'r', or the number of permutations within 
a combination. 
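
The relation between the two counts can be confirmed by listing the pairs. A small sketch, with variable names chosen for the illustration:

```python
from itertools import combinations
from math import comb, factorial, perm

# Unordered selections of two different letters out of five.
selections = ["".join(c) for c in combinations("ABCDE", 2)]

print(selections)
# Each combination of r things accounts for r! permutations.
print(comb(5, 2), perm(5, 2) // factorial(2))
```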


Frequency Distribution. The frequency distribution of certain 
events can be found mathematically. Such distributions are 
generally called 'Theoretical Distributions'. Theoretical frequency 
distributions are of many types, but three of them are of great 
importance in statistical analysis. They are: 


(A) Binomial Distribution 
(B) Normal Distribution 
(C) Poisson Distribution 


A. BINOMIAL DISTRIBUTION 

The binomial distribution is associated with the name of Jacob 
Bernoulli (1654-1705), but it was published in 1713, eight years 
after his death. Binomial means 'two names'; hence the 
frequency distribution falls into two categories. A binomial 
population is one in which all observations are classified as 
having or lacking a certain characteristic. A binomial 
distribution is a probability distribution expressing the 
probability of one set of dichotomous alternatives, e.g. success 
or failure. 

If two coins A and B are tossed simultaneously, the possible 
outcomes are: 

A    B 
H    H 
H    T 
T    H 
T    T 

H stands for head and T stands for tail. 

The probability of getting two heads simultaneously is 
1 out of four outcomes, or 1/4. That is also shown by our 
probability theorem: 

In the I Toss, probability of a head coming = 1/2 
In the II Toss, probability of a head coming = 1/2 

The probability of heads coming simultaneously is 1/2 × 1/2 = 1/4. 
The same is also true of tails. 

Let p stand for the probability of success (H) and q for 
the probability of failure. Then the above four outcomes can 
be expressed as: 

HH    HT    TH    TT 
pp    pq    qp    qq 
p²      2pq       q² 

p² + 2pq + q² is the expansion of (p+q)². Hence for two 
independent events their combined probability is given by this 
simple binomial expansion, (p+q)². 

The probability of a head, or p = 1/2 
The probability of a tail, or q = 1/2 
Then (p+q)² = (1/2 + 1/2)² = 1/4 + 1/2 + 1/4 
This is exactly what was explained above. 
Similarly, if there are three coins, A, B, C, there are the 
following 8 possible outcomes. 

A    B    C 
H    H    H    = p³ 

H    H    T 
H    T    H    = 3p²q 
T    H    H 

T    H    T 
T    T    H    = 3pq² 
H    T    T 

T    T    T    = q³ 

The probability of getting three heads is 1/8 (1/2 × 1/2 × 1/2), 
that of securing two heads 3/8, of securing one head 
3/8, and of securing no head 1/8. If p = 1/2 and q = 1/2, 

(p+q)³ = (1/2 + 1/2)³ = p³ + 3p²q + 3pq² + q³ = 1/8 + 3/8 + 3/8 + 1/8 

Expressions of this type are just what we remember from algebra: 
(a+b)² = a² + b² + 2ab 
(a+b)³ = a³ + 3a²b + 3ab² + b³ 

This expansion can be carried further by adopting the 
following formula: 

(p+q)^n 

If we want to know the probable frequencies of the various 
outcomes in a given number of trials, then the following 
expression will be used: 

N(p+q)^n 

where N stands for the number of trials and n for the 
number of independent events. If there are 100 trials and two 
independent events, the probable frequencies will be 

100(p+q)² 
= 100(p² + 2pq + q²), if p = 1/2, q = 1/2 
= 100(1/4) + 100(1/2) + 100(1/4) 
= 25 + 50 + 25 = 100 

25 for two successes 
50 for one success 
25 for no successes 

Similarly, if there are 100 trials and 3 independent events, 
the binomial expansion of 
N(p+q)^n = 100(1/2 + 1/2)³ will give the probable frequencies: 
100 (p³ + 3p²q + 3pq² + q³) 
= 100 (1/8) + 100 (3/8) + 100 (3/8) + 100 (1/8) 
= 12.5 + 37.5 + 37.5 + 12.5 = 100 

Comparison of Actual and Theoretical Frequencies 

Illustration 8 

Six dice were thrown 128 times. Each 4, 5, or 6 spot 
appearing was considered to be a success and each 1, 2, or 3 
spot a failure. The results were: 

No. of successes    Actual Frequencies    Expected Frequencies 
0                   0                     2 
1                   10                    12 
2                   26                    30 
3                   44                    40 
4                   34                    30 
5                   14                    12 
6                   0                     2 
Total               128                   128 

Expected frequency = N (p+q)⁶ = 128 (p⁶ + 6p⁵q + 15p⁴q² + 
20p³q³ + 15p²q⁴ + 6pq⁵ + q⁶) = 128 (1/64) + 128 (6/64) + 
128 (15/64) + 128 (20/64) + 128 (15/64) + 128 (6/64) + 128 (1/64) 
= 2 + 12 + 30 + 40 + 30 + 12 + 2 = 128 


Mean and Standard Deviation of Actual and Theoretical 
Frequencies: 

(m = number of successes; f = frequency; dx = deviation of m from 3; 
A = actual, E = expected) 

m      f(A)   mf(A)   dx    fdx(A)   fdx²(A)   f(E)   mf(E)   fdx(E)   fdx²(E) 
0      0      0       -3    0        0         2      0       -6       18 
1      10     10      -2    -20      40        12     12      -24      48 
2      26     52      -1    -26      26        30     60      -30      30 
3      44     132     0     0        0         40     120     0        0 
4      34     136     +1    +34      34        30     120     +30      30 
5      14     70      +2    +28      56        12     60      +24      48 
6      0      0       +3    0        0         2      12      +6       18 
Total  128    400           +16      156       128    384     0        192 

Actual frequencies: 
Mean = 400/128 = 3.125 
$$\sigma = \sqrt{\frac{156}{128} - \left(\frac{16}{128}\right)^2} = \sqrt{1.2031} = 1.10$$ 

Expected frequencies: 
Mean = 384/128 = 3 
$$\sigma = \sqrt{\frac{192}{128}} = \sqrt{1.5} = 1.22$$ 

The Mean and S.D. of the theoretical frequencies can also 
be found by the following formulas. 

Mean 
M = np 
where M = mean, n = number of independent events, p = probability 
of success 

M = 6 × 1/2 = 3 (the same as calculated above) 

Standard Deviation 
$$\sigma = \sqrt{npq} = \sqrt{6 \times \tfrac{1}{2} \times \tfrac{1}{2}} = \sqrt{1.5} = 1.22$$ 
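
The figures of Illustration 8 can be checked directly from the two frequency columns. The sketch below computes the mean and standard deviation about the true mean instead of using the step-deviation short cut of the table; the function name `mean_sd` is an assumption made for the illustration.

```python
from math import sqrt

def mean_sd(values, freqs):
    """Mean and standard deviation of a frequency distribution."""
    n = sum(freqs)
    mean = sum(v * f for v, f in zip(values, freqs)) / n
    variance = sum(f * (v - mean) ** 2 for v, f in zip(values, freqs)) / n
    return mean, sqrt(variance)

successes = range(7)                      # 0 to 6 successes
actual = [0, 10, 26, 44, 34, 14, 0]
expected = [2, 12, 30, 40, 30, 12, 2]

print(mean_sd(successes, actual))         # actual frequencies
print(mean_sd(successes, expected))       # theoretical frequencies
print(6 * 0.5, sqrt(6 * 0.5 * 0.5))       # np and the square root of npq
```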
B. NORMAL DISTRIBUTION 

The normal distribution was first discovered by the French-born 
mathematician De Moivre (1667-1754), who worked in England. It 
was later rediscovered and applied in science (both natural and 
social) and in practical affairs by the French mathematician 
Laplace (1749-1827). It was also extensively developed and 
utilised by the German mathematician, physicist and astronomer 
Gauss (1777-1855). One of the first to make extensive use of the 
normal distribution in social statistics was the Belgian 
astronomer and statistician Quetelet (1796-1874). A pioneer in 
its application to biological data was the English 
anthropologist, biometrician, criminologist, geneticist, 
meteorologist, psychologist and statistician Sir Francis Galton 
(1822-1911), a cousin of Charles Darwin. 


The normal distribution is the most important probability 
distribution in statistical theory. The probability distributions 
of most sample statistics are derived from and closely connected 
with the normal distribution. The fundamental importance of the 
normal distribution in statistics arises from the fact that the 
measures computed from samples usually tend to be normally 
distributed whether or not the original data are normally 
distributed. The normal distribution curve is called the Normal 
Probability curve, the Normal curve, or the Normal curve of error. 
A contemporary statistician, W. J. Youden, expresses his admiration 
for this curve in these words: "The normal curve of error 
stands out in the experience of mankind as one of the broadest 
generalisations of natural philosophy. It serves as the guiding 
instrument in researches in the physical and social sciences and 
in medicine, agriculture and engineering. It is an indispensable 
tool for the analysis and the interpretation of the basic data 
obtained by observation and experiment."1 

1 Quoted by H. M. Walker, Elementary Statistical Methods, 
p. 110. 

The Normal curve is bell-shaped, symmetrical and asymptotic 
in both directions to the x-axis, and depends on the two 
parameters x̄ and σ only, which are the mean and standard 
deviation of the distribution respectively. While it is always 
bell-shaped and symmetrical about its mean, its actual shape 
is determined by the standard deviation of the distribution. 
Many of the statistical techniques of describing and interpreting 
data are directly attributable to the properties of the normal 
curve. Without it the technique of sampling could not have 
developed. The curve has the following shape: 


[Figure: NORMAL CURVE. The curve is concave near the mean value, with points of inflection at ±σ.] 

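This normal curve is also represented by an equation, 
which is 

$$y = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{(x-\bar{x})^2}{2\sigma^2}}$$ 

where 

y = ordinate of the curve at a certain point 
x̄ = mean of the distribution 
σ = standard deviation of the distribution 
π = the ratio of the circumference of a circle to its diameter, 
approximately 3.1416; √(2π) = 2.5066 
e = 2.71828 (base of the Napierian logarithm) 

The formula can also be written 

$$y = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{x^2}{2\sigma^2}}$$ 

where x is now measured as a deviation from the mean. 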

Properties of the Normal Curve. The Normal curve has 
the following properties: 

1. It is a bell-shaped and symmetrical curve. 

2. The normal curve is unimodal, symmetrical about its 
peak, and its tails extend indefinitely in both directions; that 
is, although the curve comes closer and closer to the horizontal 
axis, it never touches it. 

3. All its measures of central tendency are equal, and they 
are located at the abscissa of the highest ordinate. 

4. All observations are included under the curve, above the 
X axis. 

5. The mean ordinate cuts the curve into two equal parts. 
The dispersion of the frequencies on the one side is exactly 
the same as it is on the other side. 

6. Near the mean value the normal curve is concave, while 
near ±3σ it is convex to the horizontal axis. The points of 
inflection, i.e. the points where the change in curvature occurs, 
are at ±1σ. 

7. Within a range of .6745σ on both sides of the 
mean, 50% of the frequencies occur. This range is the probable 
error. 

8. The first and third quartiles are at equal distances from the 
median. 

9. Quartile Deviation = Probable error. If the quartile deviation 
is added to the lower quartile or subtracted from the upper 
quartile, it gives the Mean = Median = Mode. 

10. The Mean Deviation is .7979, or about 4/5, of the Standard 
Deviation. 

11. The probable error is .845 of the Mean Deviation. 

12. The area lying between the normal curve and the 
horizontal axis is said to be the area under the curve and is 
equal to the total frequency of the distribution. The 
standard deviation distributes the area under the normal curve 
as given below: 

(i) Mean ± 1σ covers 68.268% of the area; 34.134% of the area 
will lie on either side of the mean. 
(ii) Mean ± 2σ covers 95.45% of the area. 
(iii) Mean ± 3σ covers 99.73% of the area. 

Mean ± 3σ covers almost all the area, leaving only .27% 
of the area outside these limits. 
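
The areas quoted in property 12 and the probable error of property 7 can be verified with the error function of the standard library. The sketch assumes the usual relation between the normal curve and `math.erf`; the function name `area_within` is hypothetical.

```python
from math import erf, sqrt

def area_within(k):
    """Share of the area under the normal curve lying within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (0.6745, 1, 2, 3):
    print(k, round(100 * area_within(k), 3), "per cent")
```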


Illustration 9 

You are in charge of rationing in a State affected by 
food shortage. The following reports arrive from local 
investigators: 

Daily calorific value of food available per adult during 
current period: 

Area    Mean    S.D. 
A       2000    350 
B       1750    100 

The estimated requirement of an adult is taken at 2500 
calories daily and the absolute minimum at 1000. Comment on 
the reported figures and determine which area in your opinion 
needs more urgent attention. (M. Com. B.H.U.) 

Area A                              Area B 
Mean ± 3σ                           Mean ± 3σ 
2000 ± (3 × 350)                    1750 ± (3 × 100) 
between 950 and 3050 calories       between 1450 and 2050 calories 

Hence area A needs more urgent attention, because there are 
some people to whom only 950 calories are available when the 
absolute minimum is 1000 calories. 
The normal curve can also be presented in terms of the deviation 
from the mean expressed in units of the standard deviation, by 
means of the formula 

(X - x̄)/σ 


Illustration 10 

Monthly rent is normally distributed about a mean of 56.50 
rupees, with a standard deviation of 16.23 rupees. A random 
sample of 300 houses is taken. If the rents of these houses 
are arranged as a frequency distribution with classes under 10 
rupees, under 20 rupees, under 30 rupees and so on, what 
frequency would you expect in each class according to the Normal 
Distribution? 

Rent        Lower      (X - 56.50)    Area between mean    Area in    Expected 
in Rs.      limit X    / 16.23        and ordinate at X    class      frequency 
1           2          3              4                    5          6 
under 10    0          -3.48          .5000                .0020      0.6 
10- 20      10         -2.87          .4980                .0102      3.1 
20- 30      20         -2.25          .4878                .0393      11.8 
30- 40      30         -1.63          .4485                .1024      30.7 
40- 50      40         -1.02          .3461                .1907      57.2 
50- 60      50         -0.40          .1554                .2425      72.8 
60- 70      60         0.22           .0871                .2096      62.9 
70- 80      70         0.83           .2967                .1298      38.9 
80- 90      80         1.45           .4265                .0538      16.1 
90-100      90         2.06           .4803                .0160      4.8 
100-110     100        2.68           .4963                .0037      1.1 
Total                                                      1.0000     300.0 

Column 5 is obtained from column 4 by successive subtraction 
of the areas, except for the class in which the mean lies, for 
which the two areas are added. Column 6 is column 5 multiplied 
by 300. 
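
The expected frequencies of Illustration 10 can be recomputed without a printed table of areas. The following is a sketch under stated assumptions: the areas are obtained from `math.erf` instead of the table, the first and last classes are taken to include the whole of their tails (as the table does by using .5000), and the names `cdf`, `expected` and `labels` are introduced only for the illustration. Because the deviates are not rounded to two decimals, the results differ slightly from the tabled figures.

```python
from math import erf, sqrt

MEAN, SD, HOUSES = 56.50, 16.23, 300

def cdf(x):
    """Share of rents below x under the normal curve of the illustration."""
    return 0.5 * (1 + erf((x - MEAN) / (SD * sqrt(2))))

upper_limits = list(range(10, 101, 10))               # 10, 20, ..., 100
cumulative = [cdf(u) for u in upper_limits] + [1.0]   # last class takes the upper tail

expected = []
previous = 0.0                                        # first class takes the lower tail
for c in cumulative:
    expected.append(HOUSES * (c - previous))
    previous = c

labels = ["under 10"] + [f"{lo}-{lo + 10}" for lo in range(10, 101, 10)]
for label, freq in zip(labels, expected):
    print(f"{label:>8}  {freq:6.1f}")
```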


Illustration 11 

The Gwalior Municipal Corporation installs 5000 electric 
lamps in the streets of the city. If these lamps have an 
average life of 800 burning hours with a standard deviation 
of 150 hours, (i) what number of the lamps might be expected 
to fail in the first 600 hours? and (ii) what number of lamps 
may be expected to fail between 700 and 900 burning hours? 

(i) According to the table, the areas lying between the mean 
ordinate and the ordinates at 0 and at 600 hours, i.e. at 
(0 - 800)/150 σ and (600 - 800)/150 σ, are .50000 and .40824. 

Hence the difference between the two areas 
= .50000 - .40824 = .09176 

Thus the expected number of failures will be 

.09176 × 5000 = 459 

(ii) The area lying between (700 - 800)/150 σ and (900 - 800)/150 σ 

= .24537 + .24537 = .49074 

.49074 × 5000 = 2454 is the number of expected failures of 
lamps. 


Illustration 12 

Average sales of Bata Shoe Co.'s multiple shops are Rs. 12500 
per month with a standard deviation of Rs. 4050. Find out 
what proportion of all shops sold between Rs. 12500 and Rs. 16550. 

(X - x̄)/σ = (16550 - 12500)/4050 = 1 

By referring to the table we find the area covered by the normal 
curve = .34134 

.34134 × 100 = 34.134% of shops sold between Rs. 12500 and Rs. 16550. 


Advantages of Normal Curve. The normal curve is of very 
great importance in the sphere of statistical theory and of 
statistical interpretation. Many natural phenomena follow the 
normal distribution approximately, though there are 
exceptions here and there to this type of distribution. As it 
is the curve of errors, on the basis of x̄ and σ we can decide 
whether a particular observation can be included in or is to 
be omitted from the observations under study. The normal 
distribution thus provides a test for the scrutiny of any observed 
data. Besides this, the normal curve is the only criterion for judging 
from a sample of a population the character of the population 
itself, or for comparing samples on the assumption of normality 
of the population and on the basis of normality of the distributions 
of specific sampling coefficients. All modern statistical theories 
concerning unimodal curves, the design of experiments, and their 
statistical interpretation have been developed on the basis of 
this curve. The major application of the normal curve concept 
is in the field of industrial and agricultural problems. 


C. THE POISSON DISTRIBUTION 

Simeon Poisson, a nineteenth century French mathematician, 
developed a concept of frequency distribution in 1837 which is 
known as the "Poisson Distribution". This distribution is applicable 
in cases where the probability of success is either very high or 
very low. According to Poisson's distribution the probabilities of 
certain events are determined by the following equation: 

$$p_n = e^{-a}\,\frac{a^{n}}{n!}$$ 

where 

e = the base of the natural logarithms, with a value of 
2.7183 

a = arithmetic average 

n = number for which the probable frequency is to be calculated. 
Illustration—13


Fit a Poisson distribution to the following set of observations :—


Deaths       0    1    2    3    4
Frequency  122   60   15    2    1
and calculate the theoretical frequencies. (M. Sc. Agra)

 m      f      mf    expected
                     frequencies
 0     122      0      121
 1      60     60       61
 2      15     30       15
 3       2      6        3
 4       1      4        0
       200    100      200

a = Σmf / Σf = 100/200 = .5

The probability of 0 deaths—

$P(0) = \frac{e^{-a}\, a^{0}}{0!} = e^{-.5}$

= Rec. [AL (.5 × log 2.7183)]
= Rec. [AL (.4343 × .5)]
= Rec. [AL (.21715)]

= Rec. 1.649

= .6065


The expected number of cases with 0 deaths in 200 cases
= .6065 × 200 = 121.30


Each further frequency follows from the one before it, since
$N P(n) = N P(n-1) \times a/n$ :

N P(1) = 121.30 × .5 = 60.65

N P(2) = 60.65 × .5/2 = 15.16

N P(3) = 15.16 × .5/3 = 2.53

N P(4) = 2.53 × .5/4 = .32
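
The fitting procedure of Illustration 13 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `poisson_expected` is an assumption of the example, and the exact value of e is used in place of the logarithm tables, so the first frequency differs slightly in the second decimal.

```python
import math

def poisson_expected(observed):
    """Return (mean, expected frequencies), where observed[k] is the frequency of k events."""
    total = sum(observed)
    mean = sum(k * f for k, f in enumerate(observed)) / total
    expected = [total * math.exp(-mean)]          # frequency of 0 events
    for k in range(1, len(observed)):
        # each term is the previous one multiplied by mean / k
        expected.append(expected[-1] * mean / k)
    return mean, expected

mean, expected = poisson_expected([122, 60, 15, 2, 1])
print(mean)
print([round(e, 2) for e in expected])
```

The recurrence avoids computing factorials and powers separately, which is also how the hand calculation proceeds.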
Illustration—14


In 1000 consecutive issues of the ‘Utopian Seven Daily
Chronicle’ the deaths of centenarians were recorded, the number
x having frequency f according to the table.


x    0    1    2    3    4    5    6    7    8
f  229  325  257  119   50   17    2    1    0


Show that the distribution is roughly Poissonian by
calculating its mean, and then the frequencies in the Poisson
distribution with the same mean and the same total frequency
of 1000. (Given $e^{-1.5} = .2231$ approx.) (M. Sc. Agra)
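Solution. The mean is Σxf / Σf = 1500/1000 = 1.5. The number
of cases with 0 deaths, in 1000

$p_0 = e^{-1.5} \times \frac{(1.5)^{0}}{0!} \times 1000$

= .2231 × 1 × 1000 = 223.1
Similarly :

$p_1 = e^{-1.5} \times \frac{1.5}{1!} \times 1000$
= .2231 × 1.5 × 1000 = 334.65

$p_2 = e^{-1.5} \times \frac{(1.5)^{2}}{2!} \times 1000$
= (.2231 × 1.5 × 1.5 × 1000)/(2 × 1) = 250.9875

$p_3 = e^{-1.5} \times \frac{(1.5)^{3}}{3!} \times 1000$
= (.2231 × 1.5 × 1.5 × 1.5 × 1000)/(3 × 2 × 1) = 125.49375

$p_4 = e^{-1.5} \times \frac{(1.5)^{4}}{4!} \times 1000$
= (.2231 × 1.5 × 1.5 × 1.5 × 1.5 × 1000)/(4 × 3 × 2 × 1) = 47.06016

$p_5 = e^{-1.5} \times \frac{(1.5)^{5}}{5!} \times 1000 = 14.11805$

$p_6 = e^{-1.5} \times \frac{(1.5)^{6}}{6!} \times 1000 = 3.52951$

$p_7 = e^{-1.5} \times \frac{(1.5)^{7}}{7!} \times 1000 = .75632$

$p_8 = e^{-1.5} \times \frac{(1.5)^{8}}{8!} \times 1000 = .14181$


We can now compare the observed and the expected number
of cases in the following table :—


x     observed values    expected values
          of f               of f

0          229              223.10
1          325              334.65
2          257              250.99
3          119              125.49
4           50               47.06
5           17               14.12
6            2                3.53
7            1                0.76
8            0                0.14
Total     1000              999.84


N.B. The difference of .16 is on account of approximation and of
the very small frequencies beyond x = 8.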


If we compare the expected frequencies with the observed
frequencies, we see that the Poisson fit is
remarkably good.


Mean and Variance of Poisson Distribution


In Poisson Distribution
Mean = Variance = a, i.e. σ² = a
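
The equality of mean and variance gives a quick check on whether observed data are roughly Poissonian. The sketch below, in Python, applies it to the centenarian data of Illustration 14 (frequencies 229, 325, 257, 119, 50, 17, 2, 1, 0 for x = 0 to 8); the function name is only illustrative.

```python
def mean_and_variance(freq):
    """Mean and variance of a frequency table, where freq[x] is the frequency of x."""
    n = sum(freq)
    mean = sum(x * f for x, f in enumerate(freq)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in enumerate(freq)) / n
    return mean, variance

mean, variance = mean_and_variance([229, 325, 257, 119, 50, 17, 2, 1, 0])
print(mean, round(variance, 2))
```

The observed mean and variance come out very nearly equal, which agrees with the good fit found in the illustration.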


Theoretical Questions 


1—List the chief properties of the normal distribution. Why 
is the distribution given a central place in Statistics ? 
(M. Com., B.H.U.) 
2—Discuss the various types of relationship which hold good
in normal distribution. What is meant by area relationship in a
normal curve and of what use is it in the theory of sampling ?


Practical Questions 
1—From a set of 19 cards numbered 1, 2, 3..... 19, one
is drawn at random. Find the chance that its number is divisible
by 3 or 7.


(Ans. 8/19)


2—Find the chance of drawing a king, a queen and a knave
in that order from a pack of 52 cards in three consecutive draws,
the cards drawn not being replaced.

(Ans. (4 × 4 × 4)/(52 × 51 × 50))

3—A and B stand in a ring with ten other persons. If the
arrangement of the 12 persons is at random, find the chance that there
are exactly three persons between A and B. (M. Sc. Agra)


(Ans. 2/11)


4—An experiment succeeds twice as often as it fails. Find
the chance that in the next six trials there will be at least four
successes. (M. Sc. Agra)


(Hint—p = 2/3, q = 1/3 ; take the first three terms of
(2/3 + 1/3)⁶. Ans. 496/729)
5—Find the chance of throwing an odd number with an
ordinary die. (Ans. 1/2)


6—There are 5 white and 7 red balls in a bag, what is the
chance that a white ball is drawn and then a red, the first ball
not being put back.
(Ans. 35/132)
7—A bag contains 3 red, 4 black and 2 white balls. What is
the chance of drawing a red and a white ball, each ball being put
back after it is drawn ?
(Hint—If the first ball drawn is white and the second red,
then p = 2/9 × 3/9 = 2/27 ;
if the first ball drawn is red and the second white,
then p = 3/9 × 2/9 = 2/27 ;
joint probability = 2/27 + 2/27 = 4/27)
8—What is the chance that a leap year selected at random
will contain 53 Sundays ? (M. Sc. Agra)


(Hint—A leap year consists of 52 complete weeks and 2 days
over. These two days can be

(1) Monday & Tuesday 

(2) Tuesday & Wednesday 

(3) Wednesday and Thursday 

(4) Thursday & Friday 

(5) Friday & Saturday 

(6) Saturday & Sunday 


(7) Sunday & Monday. Hence p = 2/7)


9—Three groups of children contain respectively
3 girls and 1 boy,
2 girls and 2 boys,
1 girl and 3 boys.


One child is selected at random from each group. Show that
the chance that the three selected consist of 1 girl and 2 boys
is 13/32. (M. Sc. Agra)
(Hint—p = girl from the first, boy from the second, boy from
the third = 3/4 × 1/2 × 3/4 = 9/32
p = boy from the first, girl from the second, boy from
the third = 1/4 × 1/2 × 3/4 = 3/32
p = boy from the first, boy from the second and girl from
the third = 1/4 × 1/2 × 1/4 = 1/32

Total = 9/32 + 3/32 + 1/32 = 13/32)

10—Estimate the probability of securing the total 5 from
two dice thrown simultaneously. (Ans. 1/9)

11—Given N=200, p=q=1/2 and n=3 find the probable
frequencies of different numbers of successes.

(200 (p+q)³ = 25, 75, 75, 25)

12—A normal distribution has a mean of 12 and a σ of 4.
Find the percentage of the cases that fall (i) between 8 and 16
(ii) above 18 and (iii) below 6

(Ans. (i) 68.26%, (ii) 6.68%, (iii) 6.68%)

13—One purse contains 1 sovereign and 4 rupees, a second
one contains 2 sovereigns and 3 rupees and a third contains 3
sovereigns and 2 rupees. One purse is selected at random out
of these and a coin is taken out from it. What is the probability
that it is a sovereign ? (M. Com. Alld.) (Ans. 6/15)


14—Three dice are shaken and thrown. What are the respective
chances that the points will total 3, 4, 5, 18 assuming that
the dice are unbiassed ? (Inc. Acctt.)
(Ans. 1/216, 1/72, 1/36, 1/216)


15—Goddard, the captain of the West Indies Cricket Team,
is reported to have observed the rule of calling 'heads' every
time the toss was made during the five matches of the last test
series with the Indian Team. What is the probability of his
winning the toss in all the 5 matches ?


How will the probability be affected if he has made a rule
of tossing a coin privately to decide whether to call ‘head’ or
‘tail’ on each occasion ?

(Ans. 1/32, No effect because events are independent)


16—Peter and Paul play a game with two dice. Peter plays
first by throwing the dice together. If the total number of points is a
prime number other than 2, he wins outright ; if it is even he
throws again under the same conditions, in other cases the throw
is passed to Paul who throws under the same conditions. What
is the probability of Peter’s winning ? (M.A. Agra)

(Ans. 18/95)

17—Five dice were thrown together 96 times. The number
of times 4, 5 or 6 was actually thrown in the experiment is given
below. Calculate the expected frequencies.
No. of dice showing 4, 5 or 6— 0 1 2 3 4 5
Observed f ae T0 10 

(B.A. Madurai)


(Expected frequency according to Binomial Distribution—
3, 15, 30, 30, 15, 3)

18—Eight coins are tossed at a time, 256 times. The number
of heads observed at each throw is recorded and the results are
given below. Find the expected frequencies. What are the
theoretical values of the mean and standard deviation ? Calculate
also the mean and S.D. of the observed frequencies.


No. of heads at a throw | 0 | 1 |  2 |  3 |  4 |  5 |  6 |  7 | 8
Frequency               | 2 | 6 | 30 | 52 | 67 | 56 | 32 | 10 | 1


(B. Com. Delhi)
(Expected Frequency—1, 8, 28, 56, 70, 56, 28, 8, 1,
Expected value of the Mean—4, S.D.—1.414
Actual value of Mean—4.0625, S.D.=1.46)
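
The answers to questions 17 and 18 can be verified with a short Python sketch. The function names are only illustrative, and the observed frequencies for question 18 are taken here as 2, 6, 30, 52, 67, 56, 32, 10, 1, an assumed reading of the printed table which totals 256 and reproduces the stated mean of 4.0625.

```python
from math import comb, sqrt

def binomial_expected(n, p, trials):
    """Expected frequencies of 0..n successes when n items are thrown `trials` times."""
    return [trials * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def mean_sd(freq):
    """Mean and standard deviation of a frequency table (freq[k] = frequency of k)."""
    total = sum(freq)
    mean = sum(k * f for k, f in enumerate(freq)) / total
    var = sum(f * (k - mean) ** 2 for k, f in enumerate(freq)) / total
    return mean, sqrt(var)

dice = binomial_expected(5, 0.5, 96)      # question 17: a 4, 5 or 6 counts as a success
coins = binomial_expected(8, 0.5, 256)    # question 18: a head counts as a success
theoretical = (8 * 0.5, sqrt(8 * 0.5 * 0.5))            # np and sqrt(npq)
observed = mean_sd([2, 6, 30, 52, 67, 56, 32, 10, 1])   # assumed reading of the table
print(dice, coins, theoretical, observed)
```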


CHAPTER 18


BUSINESS FORECASTING 


“All business proceeds on beliefs, or judgements of probabilities,
and not on certainties.”
CHARLES W. ELIOT


Meaning and Importance. The future is unknown to us. 
Yet every day we are forced to make decisions involving future 
and therefore, uncertain events. In the world of business all 
decisions involve uncertain future conditions and events. 
Businessmen who are successful in their enterprises are often
said to have ‘good business judgement’, implying that their 
guesses prove correct more often than those of others. Great 
risk is associated with business affairs. All businessmen are
forced to make forecasts regarding business activities. Success 
in business depends upon successful forecasts of business events. 
In business or trade the importance of forecasting is so great, 
that when one enters into the business world, he really enters 
the profession of forecasting. Forewarned is forearmed. In 
recent times considerable research has been conducted in this 
field. Attempts are being made to make forecasting as scientific 
as possible. The techniques of forecasting have been improved 
to a marked degree in recent years and are applicable to almost 
every sphere of business activities. Forecasting is essential in 
order to govern business policies and actions. The problem of 
business forecasting refers to the analysis of the past and
present economic conditions with a view to drawing conclusions
about the future course of business events. Business forecasting
is based on observations of previous booms and depressions,
which are the outcome of the interaction of a number of forces operating
with varying intensity. The problem is to study as many of
these forces as possible and to compile an index from the result
of each study. “Business Forecasting refers to the statistical
analysis of the past and current movements in a given time- 
series, so as to obtain clues about the future pattern of the 
movements". (Neter and Wasserman.) The effective planning of 
business activities is a task which in the light of the present 
day problems and trends in economic, financial and social 
conditions with which industry has to cope is placing an 
increasingly heavy responsibility upon management. The 
technique of business forecasting has been developed to give 
a logical and comprehensive means of providing management 
with information to determine the most advantageous plans
which can be made within the anticipated resources of the 
business. 


Business forecasting as such is not a new development. 
Every businessman must forecast, even if his whole product 
is sold before production. In general, however, businessmen
largely produce in anticipation of demand, hence they have to
forecast the amount of sales and selling price. Business 
administrators have frequently to take important decisions on 
expansion of production, increase of inventories, extension of 
credit, curtailment of loans, development of markets, reduction 
of prices, reducing the rate of interest, increasing dividends, 
issue of new capital etc. These decisions cannot be made 
offhand. Therefore forecasts are inevitable in any kind of 
business and have always been necessary. What is new is the 
attempt to put forecasting on a scientific basis, i.e. to forecast
by reference to past history and statistics rather than by pure
intuition and guess-work.


Statistics and Forecasting. Statistics are the base of
scientific forecasting. The problem of forecasting is essentially
statistical in nature. It gives recognition to the fact that the 
Science of Statistics is not only useful for studying the past 
but also for studying the present and forecasting the future. 
There are two aspects of scientific business forecasting. The 
first is the analysis of the past conditions and the second is the 
analysis of current conditions in relation to a prospective future 
trend. 


The analysis of the past conditions is done by the study
of a time series relating to business. There may be either
‘Dynamic Variations’ or ‘Static Variations’ in that series.
Dynamic variations change with reference to time and
will be represented by a historical time series. Static
variations in a frequency series do not change in relation to
time but take into account certain conditions. The task of the
statistician is to find out these variations. There are four types
of factors which affect a time series relating to business—


1—Secular Trend 
2—Regular Fluctuations 
(i) Cyclical 
(ii) Seasonal 
3—Irregular Fluctuations. 


All these four types of factors must be analysed in the
study of past conditions. The purpose of time series analysis
is to isolate and measure the effect of these fluctuations on the
composite series, so that future policies may be framed on the
basis of indications so obtained. The general trend of the
series gives an idea of the direction in which the series was
moving and what is its probable course for the future. The
cyclical fluctuations reveal the effect of business cycles, that is, whether
the series under study is passing through a period of boom or
depression. Such study is very helpful ; for example, if the peak
point of a boom period has been reached, there is bound to be a movement
in the reverse direction. The study of seasonal variations
indicates the trend for the immediate future. While analysing
the present conditions, certain new factors must also be taken
into account which will govern the future course of movement,
such as new inventions, changes in design and fashions, changes
in the government's economic policy, war etc.
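
How the trend and the seasonal element may be isolated from a composite series can be sketched as follows. This is a minimal Python sketch of the ratio-to-moving-average approach, one of several possible methods and not necessarily the one the text has in mind; the quarterly sales figures and the function names are hypothetical.

```python
def centred_moving_average(series, period=4):
    """Centred moving average for an even period; None where it cannot be formed."""
    half = period // 2
    trend = [None] * len(series)
    for i in range(half, len(series) - half):
        window = series[i - half:i + half + 1]
        # the two end values of the window carry half weight
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[i] = total / period
    return trend

def seasonal_indices(series, trend, period=4):
    """Average ratio of actual to trend for each season, scaled to average 100."""
    ratios = [[] for _ in range(period)]
    for i, (y, t) in enumerate(zip(series, trend)):
        if t is not None:
            ratios[i % period].append(y / t)
    means = [sum(r) / len(r) for r in ratios]
    scale = period / sum(means)
    return [100 * m * scale for m in means]

sales = [20, 30, 40, 30, 24, 36, 48, 36, 28, 42, 56, 42]  # hypothetical quarterly sales
trend = centred_moving_average(sales)
indices = seasonal_indices(sales, trend)
print([round(v, 1) for v in indices])
```

Dividing each original figure by its seasonal index (over 100) would give a deseasonalised series in which the trend and cyclical movements are easier to see.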


Besides time series analysis, there is another statistical 
basis on which forecasting may be done. It is by the study of 
frequency distribution. The frequency distribution may be— 


(a) Mono Variate—It deals with a single series. If a 
businessman has to sell a particular size of shoes he would find 
firstly the mode, then variability. He will not keep shoes which 
have no demand. 


(b) Bi-Variate—It relates to variations in two series.
A regression equation is calculated in this case, which serves as a means
of prediction.


(c) Multi-Variate—It relates to a number of series. It
arises where a dependent series is influenced by a number of
independent series and we have to find out the effect of each
independent series.
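
For the bi-variate case, prediction by a regression equation can be sketched in Python as below. The figures for advertising outlay and sales are hypothetical and serve only to show the arithmetic of a least-squares line of y on x.

```python
def fit_line(x, y):
    """Least-squares regression of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

advertising = [1, 2, 3, 4, 5]     # hypothetical outlay
sales = [12, 15, 19, 22, 27]      # hypothetical sales
intercept, slope = fit_line(advertising, sales)
forecast = intercept + slope * 6  # predicted sales for an outlay of 6
print(round(intercept, 2), round(slope, 2), round(forecast, 2))
```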


For scientific business forecasting two things are required—
(1) detailed information about the past movements and (2)
special factors affecting the movement. If, for example, a
forecast has to be made about the prices of cotton in future,
one should have detailed information about past movements in
the prices of cotton and the various factors affecting cotton prices
at the present moment. In other words, detailed figures of
production, mill consumption, exports and imports should be
collected, and the government policy, weather conditions, area
under cultivation and similar other factors should be analysed.

An important part of the work of business forecasting lies
in constructing index numbers or so called business barometers.
This is a modern device to study the trends, seasonal fluctuations,
cyclical movements and irregular fluctuations. Business barometers
facilitate various forms of business forecasting. Changes
in production may be measured by a general index of production.
Similarly monetary and business conditions may be measured
by indices of business activity. But use of business barometers
should be made with great care. They are not sure roads to
success. They have their own limitations.


Technique of Business Forecasting. If forecasting is to be
of any assistance, a sound and suitable technique must be
developed for making forecasts.


The first method is to assemble data relating to the period 
just past and to assume that any discernible trend will continue 
into the immediate future. In the absence of any other 
information this is quite a useful method, but it rests on the 
validity of the assumption that the conditions which determined 
the trend during the period just elapsed will continue to operate 
in the immediate future. 
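
The first method may be sketched in Python as follows, assuming that the discernible trend is a straight line fitted by least squares with time measured from the middle of the period. The output figures and the function name are hypothetical.

```python
def linear_trend(values):
    """Fit y = a + b*x by least squares, with x measured from the mid-point of the period."""
    n = len(values)
    xs = [i - (n - 1) / 2 for i in range(n)]   # deviations from the middle, summing to zero
    a = sum(values) / n                        # with this origin, a is simply the mean
    b = sum(x * y for x, y in zip(xs, values)) / sum(x * x for x in xs)
    return a, b, xs

output = [50, 54, 57, 61, 63]   # hypothetical annual output for five consecutive years
a, b, xs = linear_trend(output)
forecast = a + b * (xs[-1] + 1)  # trend value projected one year ahead
print(a, round(b, 2), round(forecast, 2))
```

The projection is only as good as the assumption stated in the text, namely that the conditions which produced the past trend continue to operate.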


An alternative method using the same data is to examine
the factors and underlying forces which have produced those
results. Once they have been ascertained, it is up to the forecaster
to decide whether those same factors are going to remain equally
effective in the future, or, if they are likely to change, how they
will affect his business.


There is another method which may be called ‘forward
looking statistics’. These are derived from surveys of consumers'
and industrialists' expectations and intentions regarding
the future. In the United States such surveys are very frequent.


The most recent and interesting development in business
forecasting is in the field of econometric models. Such a model is a system
of simultaneous equations which is so designed as to represent
the workings of the national economy. The ‘model’ is especially
valuable when the statistician is dealing with variables which
are prone to sudden change, i.e. in market situations where
uncertainty is great. Its merit is that it explains why and how
the changes in the economy or market are to take place ; an ordinary
forecast merely shows what the movement is without specifying
in detail the determining factors. The larger models may have
nearly fifty equations based upon various data reflecting the
relationship between variables such as the level of investment
and consumers' demand, imports and internal economic activity.
This highly specialised field is the undisputed preserve of the
expert and only begins to be comprehensible to the mathematician,
who requires the services of modern electronic computers to
resolve these complex models.
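
What is meant by a system of simultaneous equations can be seen from the smallest possible case. The Python sketch below solves a two-equation model, C = a + bY and Y = C + I; this textbook model and its figures are assumptions made for illustration and are not taken from any actual econometric model.

```python
def solve_income_model(autonomous_consumption, propensity, investment):
    """Solve C = a + b*Y and Y = C + I simultaneously for income Y and consumption C."""
    income = (autonomous_consumption + investment) / (1 - propensity)
    consumption = autonomous_consumption + propensity * income
    return income, consumption

# hypothetical values: a = 20, b = 0.75, I = 30
income, consumption = solve_income_model(20, 0.75, 30)
print(income, consumption)
```

A model of fifty equations is solved on the same principle, but the labour involved is what calls for electronic computers.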


Theories of Business Forecasting. Some of the important 
theories of business forecasting are :— 


1—Economic Rhythm Theory
2—Action and Reaction Theory
3—Sequence Method
4—Specific Historical Analogy
5—Cross-cut Analysis.


1—Economic Rhythm Theory—The exponents of this theory
believe that economic phenomena behave in a rhythmic order.
Cycles of nearly the same intensity and duration tend to recur.
“History repeats itself” is the basic assumption of this
theory. According to this theory the available historical data
should be analysed into their component parts. The different types
of fluctuations influencing them have to be segregated. Then
a trend will be obtained that will represent a long time tendency
of growth or decline. This trend is then projected a number of years
into the future, either freehand or by means
of the trend equation. The trend values for a period
of perhaps a year or two are multiplied by an estimate of future cyclical
movements, the forecaster being guided by the past cyclical behaviour of that
series. The trend line is supposed to represent the normal
growth or decline of the series.


Experience shows that business cycles are not strictly
periodic and that the statistical extrapolation of cycles is not
very satisfactory. This makes the practical application of the
theory very difficult. Another difficulty is that an area can be
produced by an increase in either amplitude or duration, whereas
the businessman is primarily interested in predicting the
turning points of a cycle. All trend lines are extended tentatively
and if the areas below the line do not equalise the corresponding
areas immediately preceding, it is a simple proposition to revise
the trend line so that the areas will be equalised. Trend lines
may be revised by connecting the centres of gravity of successive
cycles by straight lines.


2—Action and Reaction Theory—This ‘Action and Reaction’
theory is borrowed from the physical sciences, where it means “for
every action there is always an appropriate and equal reaction."
According to this theory a certain level of business activity is
normal. Whenever business activity is depressed or inflated it
tends to return to normal. Below or above normal conditions
cannot continue for long. The amplitude and duration of prosperity
must be balanced by the amplitude and duration of depression.


3—Sequence or Time-Lag Theory—This theory is based
upon the conception that most of the business data have a
lead and lag relationship. Changes in business are successive
and not simultaneous. This is probably the method in greatest
favour among forecasters. According to Irving Fisher, a
given price change does not have its entire effect in any one
month but its effect is distributed over a number of months,
reaching a climax after a certain period of time and then
dwindling in importance.
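
One way of looking for a lead and lag relationship is to correlate one series with another shifted by 0, 1, 2 ... periods and to see which shift gives the closest agreement. The Python sketch below does this for two hypothetical series, in which output has been constructed to follow orders by two periods; the names and figures are illustrative only.

```python
def correlation(x, y):
    """Coefficient of correlation between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def best_lag(leader, follower, max_lag):
    """Lag (in periods) at which the follower agrees most closely with the leader."""
    scores = {}
    for lag in range(max_lag + 1):
        x = leader[:len(leader) - lag]   # leader values that have a later counterpart
        y = follower[lag:]               # follower values `lag` periods afterwards
        scores[lag] = correlation(x, y)
    return max(scores, key=scores.get), scores

orders = [10, 12, 15, 13, 11, 14, 18, 16, 12, 11]   # hypothetical leading series
output = [51, 49, 50, 52, 55, 53, 51, 54, 58, 56]   # hypothetical series that follows
lag, scores = best_lag(orders, output, max_lag=3)
print(lag, {k: round(v, 3) for k, v in scores.items()})
```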


One factor that impairs the usefulness of this method is the
scarcity of time series on the basis of shorter periods. It is
argued that economic processes are interrelated. It does not
seem logical that the cause and effect relationships which
supposedly surround us on every side must take a month or
more for their development. There must be many that work out
in a few days, a few hours or nearly instantaneously.


4—Specific Historical Analogy—Since all business cycles are
not uniform in amplitude or duration, some forecasters make
use of history not by projecting any fancied economic rhythm
into the future, but by selecting some specific previous situation
which has many of the earmarks of the present and concluding
that what happened in that previous situation will happen in
the present one.


Although it is undoubtedly true that partial analogies can 
be discovered in past history one should be careful to take into 
consideration the differences as well as similarities between past 
and present situations. The differences in the amount and type 
of government intervention in economic affairs should specially 
be taken care of. 


5—Cross-cut Analysis—This theory is based upon the
assumption that no two cycles are alike, but that like causes
always produce like results. All the factors bearing upon a
given situation are assembled and, relying upon his knowledge
of economic processes, the forecaster concludes whether the
situation is favourable or not. Although it is a non-statistical
method, it is possible to develop a statistical technique by
assigning weights to each factor and then counting the score
to see whether the net result is favourable or not.
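
The scoring technique just described can be sketched in Python as below. The factors, their weights and their ratings are all hypothetical; in practice they would rest on the judgement of the forecaster.

```python
def crosscut_score(factors):
    """Sum of weight x rating; rating is +1 (favourable), 0 (neutral) or -1 (unfavourable)."""
    return sum(weight * rating for weight, rating in factors.values())

factors = {                         # hypothetical factors: (weight, rating)
    "order books": (3, +1),
    "money rates": (2, -1),
    "stocks of goods": (1, -1),
    "government policy": (2, +1),
}
score = crosscut_score(factors)
outlook = "favourable" if score > 0 else "unfavourable" if score < 0 else "neutral"
print(score, outlook)
```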


An American agency known as ‘United Business Service’ uses 
an interesting type of cross-cut analysis. Instead of marshalling 
an array of factors bearing upon a given situation, opinions 
of authorities are assembled and listed each week. A decision
is then rendered based on the weighted opinions of these 
authorities modified by the Service's own conclusions. 


Forecasting Agencies. Business forecasting has become a 
business. There are agencies, which employ expert statisticians 
to analyse and interpret statistical material and publish results.
Such agencies are very common in U.S.A. and U.K. In India 
such agencies are not found. The following are the important 
forecasting agencies of U.S.A. and U.K. 


U.S.A.  1. Harvard Committee on Economic Research.
        2. Brookmire Economic Service.


U.K.    1. London and Cambridge Economic Service.
        2. Economists’ Organisation.


The Harvard Committee on Economic Research prepares
three indices : (i) Speculation Index—based on the town
clearings of the New York banks and the prices of industrial and
railway stocks and railway bonds, (ii) Business Index—based
on wholesale prices, country clearings and the production of pig
iron, and (iii) Money Index—based on the rate of 4 to 6 months
bills and 60 to 90 days commercial paper and loans and deposits
at the New York clearing banks.


Brookmire Economic Service is one of the pioneer services
in forecasting. In addition to forecasting it also advises on
the action to be taken. This service places utmost emphasis
on cross-cut analysis. There are a number of publications issued
by this service, e.g. The Brookmire Forecaster (monthly), The
Brookmire Purchaser (monthly), The Brookmire Counsellor
(weekly), The Brookmire Investor (bi-monthly) etc. These
publications are very useful to the business community. The
Service makes careful analysis of general trend, seasonal
variations and business cycles. This is done in order to make
an appraisal of the business outlook. Population growth and
movement, inventions, capital formation, exploitation of new
resources etc. affect the business trend.

London and Cambridge Economic Service consists of the
staff of the Economics Department of the Cambridge University
and of the London School of Economics. A number of bulletins
are issued by the Service giving information on various branches
of Economics and Finance. It adopts four indices, viz. industrial
share prices, wholesale prices, the value of exported manufactures
and the short money rates.

The Economist, which is an English economic journal,
has published its own Index of Business Activity since 1933. It
is designed to measure the changes in the economic activity
of the U. K. in quantitative terms. It gives an approximate idea
of the real national income.

In India, scientific business forecasting is practically absent.
However ‘Capital’ and ‘Eastern Economist’ construct Index
Numbers of Industrial Activity in India. From March 1961
two dailies, ‘Economic Times’ and ‘Financial Express’, are also
published in the country. They deal exclusively with business
affairs. It is hoped that in future such agencies will take up
business forecasting also. The National Council of Applied
Economic Research also undertakes research on business cum
economic problems. It is gratifying to note that merchants of
India are now paying due attention to business research.
The merchant mind in India is, at the moment, exercised
greatly over producing more and better goods. It is a symptom
of the new economic era that has dawned upon the country
and is fast beginning to shape the entire pattern of the national
life. Everywhere the emphasis is on business research.
The Economic Research and Training Foundation, established
by the Indian Merchants' Chamber to commemorate its
Golden Jubilee, is a step, perhaps long delayed, perfectly in
harmony with the economic activity of the country. The
initial capital outlay of Rs. 25 lakhs indicates the
importance that the Chamber attaches to the venture.
Apparently, the only way to keep up with the growing
tempo of industrial development is through systematic economic
research. Today economic issues can only be considered in the
light of technical knowledge. The results of meticulous research
should be applied not only to the solution of the economic
problems of the present, but should also be the basis of the
policies of the future. Methodical collection and accurate
analysis of data on current economic conditions are a necessary
aid to correct economic thinking and industrial practice. Industrial
problems are of a specialised nature and demand a specialised
approach and expert handling. Till now facilities for a correct
assessment of economic phenomena have been largely lacking in
India. The main objective of the Foundation is to undertake
applied economic research and impart theoretical and practical
training in business research.


Its Utility to Business Community. It is obvious that, as
they stand, indices prepared by these agencies are of little use
to the individual businessman. They are concerned merely with
general business conditions. As such they are useful to the
business community in that they show the position of
business in the general fluctuations, i.e. whether in the depression,
recovery or prosperity stage, indicate the general course
of business in the immediate future and give some sign
of when the next major change is likely to occur. But given
such information the individual enterpriser is far from
satisfied ; he has still to discover the relation that exists between
his business and the general state of trade.


Production consists of a number of processes, and although
these processes are inter-related, fluctuations in them do not
coincide. It has been well said that “the curve of business is
not the same for soap and for shipping or for foods and for
buildings." The businessman is therefore faced with the problem
of discovering exactly how the industry in which he is concerned
reacts to trade conditions generally. To do this he can obtain
sales figures for his firm (or the industry as a whole) over a
period of years and remove from them any seasonal and secular
fluctuations. The resultant figures will be plotted as a curve
to be read in conjunction with the general business curve. It
might be noted here that the businessman whose particular
curve lags behind the general business curve is more favourably
placed than the businessman whose curve precedes that of
general conditions.


Use of Special Forecasts. In preparing business curves 
applicable to particular industries, a number of factors have to be 
taken into consideration. (i) Due allowance must be made for 
the secular trend of growth or decline in the business concerned. 
(ii) Attention must be given to competitive influences which
affect the relative position of the particular firm in the industry
as a whole. (iii) Adjustments must be made for such seasonal
fluctuations as experience shows are to be expected.


Utility of Business Forecasting. Business forecasting is
equally important to the businessman, the economist and to society
as a whole. Trade cycles, a characteristic feature of capitalist
society, bring depression and boom periods in industry, trade,
agriculture etc. Trade cycles increase the risk of business,
create unemployment, induce speculation and discourage capital
formation. Sudden price fluctuations might upset the whole
calculations of business and discourage the growth of capital
and enterprise, disrupting the whole economic organisation.
Business forecasting reduces the risk associated with business
cycles. It minimises the suddenness of price fluctuations and
has a wholesome effect on the economic organisation of the
country. If businessmen can know in advance that a period of
depression is expected in the near future, they can take precautionary
measures to minimise its harmful effects. Similarly if a
boom is expected, they can prepare themselves to take the
maximum advantage of it. For making profit, which is the
sole aim of businessmen and industrialists, business forecasting
is indispensable. A businessman has to forecast the future level of
prices and the extent of demand, and his success or failure
depends upon correct forecasting.

Reliability of Forecasting. Business forecasting, like
weather forecasting, is only valid for the next six hours or so.
Beyond that it is sheer guesswork. Those responsible for
running a forecasting service on a commercial basis are naturally
prone to make extravagant claims for its value, and this has
caused distrust of the whole system among some sections of the
business community in America. As regards our country there
is nothing like forecasting. Our business community has no
faith in it. English research work has not for the most part
been conducted with a view to profit, so that there has not been the
same temptation to conceal defects. It would be unwise to make
extravagant claims for such “guesstimates”, as they have been
called, but there have been a number of occasions when
predictions were sufficiently accurate. Complete dependence upon
forecasting is not desirable. The businessman relying
completely upon it is foredoomed to disappointment. Forecasts
are not prophecies ; their function is to minimise the unknown
factor by calculation of probable events. From ‘what has happened’
we find out ‘what is likely to happen’.


The general conclusion appears to be that the forecasts
have proved a fairly accurate, but by no means infallible, guide to
major fluctuations. All sciences in their early stages have been
open to the criticism that they were not sufficiently exact to be of
much practical value. This, however, has always been regarded
by scientists as an argument in favour of continuing their
researches until more definite results were obtained. It is
clearly established that a certain relationship exists, but the exact
nature of the relationship no doubt calls for further
investigation. In the words of Prof. Elmer C. Bratt,
"Forecasts are statements of expected future conditions;
definitive statements of what will actually happen are patently
impossible. Expectation depends upon assumptions made. If
assumptions are plausible the forecasts have better chances of
being useful."


Questions


1. Discuss the important theories of business forecasting.
How does analysis of time series help in forecasting of business
events ? (M. Com. Allahabad)

2. What is meant by business forecast ? Explain the major
classes of methods used in forecasting. (M. Com. Lucknow)

3. What is the practical importance of forecasting in
business ? Describe the various methods of business forecasting.

4. "Business forecasting like weather forecasting is only
valid for the next six hours or so". Give your opinion about this
statement.

5. What is Business Forecasting ? What different theories
of it are known to you ? Explain clearly any two of them.
(M. Com. Allahabad)


CHAPTER 19 


DEMOGRAPHY 


“The numerical portrayal of a human population is sometimes 
known as ‘demography’. The population is viewed as an aggregate 
of persons represented by certain types of statistics. Demography 
is concerned with the behaviour of the aggregate and not with the 
behaviour of individuals." 

George W. Barclay 


Demography is the study of the measurements of human
populations. In ancient days the main purpose of statistics
relating to the human population of a country was
military and fiscal. With the growth of new industrial,
economic and political systems, the need for such statistics has
increased tremendously. Such statistics play an important role
in the social sciences. The balance between population and
resources is a problem which has been of great interest to
economists since the days of Malthus. This interest has been
revived today, especially in under-developed countries. Such
statistics help in planning for economic development. The
under-developed countries get much of their basic planning
material from population statistics.


Data on population are acquired periodically through censuses 
and continually through birth, death, migration, sickness etc. 
records. There are different statistical techniques for analysing 
the population data. Various rates and coefficients are calculated 
which throw light on the different aspects of the population. 
Through statistical techniques absolute results of the statistical 
data are turned into relative numbers. The relative numbers 
are called ‘vital rates’ and vital ratios. Here we deal with some 
popular rates and ratios used in connection with population 
statistics. 


Death and Birth Rates. Death and birth rates may
be of two types: (1) Crude and (2) Standardised. The crude death
rate is the number of deaths per thousand of population during a
year in a particular locality. Symbolically,

$$ \text{Crude death rate} = \frac{\text{No. of deaths in a locality in a year}}{\text{No. of people living at the mid-point of the year}} \times 1000 $$

Similarly, the crude birth rate is the number of births per
thousand of population during a year in a particular locality.
Symbolically,

$$ \text{Crude birth rate} = \frac{\text{No. of births in a locality during a year}}{\text{No. of people living at the mid-point of the year}} \times 1000 $$


The crude death rate shows the level of mortality in an entire
population. It is easily and quickly computed. But crude rates
have certain limitations which make them unfit for comparison.
The crude death or birth rate for two localities may be the same
even though there are wide differences in the age-composition of
their inhabitants. Hence we calculate an age-adjusted or
standardised rate.


Illustration 1

                       Locality A                        Locality B
Age group    Population  Deaths  Death rate    Population  Deaths  Death rate
                                  per 1000                          per 1000
Under 10       20,000      600       30          12,000      372       31
10-20          12,000      240       20          30,000      660       22
20-40          50,000    1,250       25          62,000    1,612       26
40-60          30,000    1,050       35          15,000      525       35
Above 60       10,000      500       50           3,000      180       60
Total        1,22,000    3,640                 1,22,000    3,349

Crude Death Rate for Locality A

$$ \frac{\text{Total deaths}}{\text{Total population}} \times 1000 = \frac{3640}{122000} \times 1000 = 29.8 $$

It can also be calculated as

$$ \frac{(30 \times 20000) + (20 \times 12000) + (25 \times 50000) + (35 \times 30000) + (50 \times 10000)}{20000 + 12000 + 50000 + 30000 + 10000} = \frac{3640000}{122000} = 29.8 $$

Crude Death Rate for Locality B

$$ \frac{\text{Total deaths}}{\text{Total population}} \times 1000 = \frac{3349}{122000} \times 1000 = 27.5 $$

It can also be calculated as

$$ \frac{(31 \times 12000) + (22 \times 30000) + (26 \times 62000) + (35 \times 15000) + (60 \times 3000)}{12000 + 30000 + 62000 + 15000 + 3000} = \frac{3349000}{122000} = 27.5 $$
For calculating the standardised death rate, the death rate of
each age group of the locality is multiplied by the population of
the corresponding group of the standard population, and the total
thereof is divided by the total of the standard population. In the
above example, if we take locality B as the standard, the
standardised death rate of locality A will be

$$ \frac{(30 \times 12000) + (20 \times 30000) + (25 \times 62000) + (35 \times 15000) + (50 \times 3000)}{122000} = \frac{3185000}{122000} = 26.1 $$
If a separate age composition is given which is to be taken
as the standard, the standardised death rates will be calculated
as under.

Standard Population

Age Composition    Population
Under 10              150
10-20                 200
20-40                 400
40-60                 150
Above 60              100
Total               1,000

In this case the standardised death rate for Locality A is

$$ \frac{(30 \times 150) + (20 \times 200) + (25 \times 400) + (35 \times 150) + (50 \times 100)}{1000} = \frac{28750}{1000} = 28.75 $$

and for Locality B

$$ \frac{(31 \times 150) + (22 \times 200) + (26 \times 400) + (35 \times 150) + (60 \times 100)}{1000} = \frac{30700}{1000} = 30.7 $$
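The crude and standardised rates of Illustration 1 can be checked with a short script. This is only a sketch: the function names are illustrative, and the figures are those read from the table of Illustration 1 and the standard population above.

```python
# (population, deaths) for the five age groups, youngest first.
locality_a = [(20000, 600), (12000, 240), (50000, 1250), (30000, 1050), (10000, 500)]
locality_b = [(12000, 372), (30000, 660), (62000, 1612), (15000, 525), (3000, 180)]
standard_population = [150, 200, 400, 150, 100]


def crude_death_rate(groups):
    """Total deaths per 1000 of total population."""
    total_population = sum(p for p, _ in groups)
    total_deaths = sum(d for _, d in groups)
    return 1000 * total_deaths / total_population


def standardised_death_rate(groups, standard):
    """Weight each age-specific rate by the standard population of that age group."""
    rates = [1000 * d / p for p, d in groups]
    weighted_total = sum(r * s for r, s in zip(rates, standard))
    return weighted_total / sum(standard)


crude_a = crude_death_rate(locality_a)
crude_b = crude_death_rate(locality_b)

# Locality A standardised on the age composition of locality B.
std_a_on_b = standardised_death_rate(locality_a, [p for p, _ in locality_b])

# Both localities standardised on the separately given standard population.
std_a = standardised_death_rate(locality_a, standard_population)
std_b = standardised_death_rate(locality_b, standard_population)

print(round(crude_a, 1), round(crude_b, 1), round(std_a_on_b, 1), std_a, std_b)
```

Because both localities are weighted by the same standard population, the difference between the two standardised rates reflects mortality alone and not age composition.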
Crude Rate of Natural Increase. The crude rate of natural
increase of population can be found by the following formula:

$$ \text{Crude rate of natural increase} = \frac{\text{Total of annual births} - \text{Total of annual deaths}}{\text{Annual mean population}} \times 1000 $$

or

$$ \text{Crude rate of natural increase} = \text{Crude birth rate} - \text{Crude death rate} $$

The rate can also be negative, which means that the population
is decreasing.
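As a small numerical check that the two forms of the formula agree, the sketch below uses purely hypothetical figures (3,500 births, 2,000 deaths and a mean population of 1,00,000); none of these numbers come from the text.

```python
def crude_rate(events, mean_population):
    """Events per 1000 of the annual mean population."""
    return 1000 * events / mean_population


# Hypothetical figures chosen only for illustration.
births, deaths, mean_population = 3500, 2000, 100000

natural_increase_direct = 1000 * (births - deaths) / mean_population
natural_increase_from_rates = (
    crude_rate(births, mean_population) - crude_rate(deaths, mean_population)
)

print(natural_increase_direct, natural_increase_from_rates)
```

If deaths exceeded births, both expressions would come out negative, indicating a declining population.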


Sex Ratio. There are many occasions that require a brief
summary of the sex composition of a population. This ratio
can be calculated by

$$ \text{Sex ratio} = \frac{M}{F} \times K $$

Where : M is the number of males in a population
        F is the number of females in a population
        K is 100

A sex ratio for some specific age group may be found by

$$ \frac{M(18\text{-}50)}{F(16\text{-}45)} \times K $$


Child-Woman Ratio. The child-woman ratio can similarly be
found:

$$ \text{Child-woman ratio} = \frac{p(0\text{-}4)}{f(15\text{-}44)} \times K $$

Where p(0-4) is the number of children aged 0 to 4 years, i.e.
            under 5 years of age
      f(15-44) is the number of women in the age group 15-44
      K = 100


Density of Population. Density of population is calculated
for varied purposes. This can be known by

$$ \frac{p_i}{P} \times K $$

Where p_i is the number of persons in the i-th part or political
          division of the country
      P is the total number of persons in all parts
      K = 100


Fertility Rates. In demography the term fertility refers to
the actual production of children. Fertility must be
distinguished from fecundity, which refers to the capacity to bear
children. Fecundity sets an upper limit to fertility. The
simplest way to summarise the fertility conditions of a particular
area during a given period is to calculate the mean number of
children which females living right through their child-bearing
period will (on the average) bear, if they are subject to the
fertility conditions holding in the particular area during the
given period. Such a measure is known as the Total Fertility
Rate or General Fertility Rate. The child-bearing period of a
female is usually taken to be the span from 15 to 50 years of age.

$$ \text{Total Fertility Rate} = \frac{\text{Total number of births}}{\text{Total number of females in the child-bearing age group (15-50)}} \times 1000 $$


If a more comprehensive study of fertility is necessary, the
child-bearing age may be divided into various groups, and
specific fertility rates for the different age groups may be found.
Different age groups have different fertility rates, because
fertility is greater in the first few years and then
decreases with the advance of age.

$$ \text{Specific Fertility Rate for women aged (15-25)} = \frac{\text{Total number of births to females aged (15-25)}}{\text{Total number of women aged (15-25)}} \times 1000 $$


Production Rates. The total fertility rate refers to the
number of children which a female can expect to produce. A
more significant figure is the number of female children, for
this will give an indication of the number of females which a
female will produce over her lifetime to replace herself. The
total fertility rate can be calculated in terms of female births
only, by restricting the births in the specific fertility rates to
female births. Such a calculation leads to a measure called the
gross production rate. The Gross Production Rate measures
the mean number of female children which will be born to a
newly born female who is subject to the given fertility
conditions throughout her lifetime but is not subject to mortality.

$$ \text{Gross Production Rate} = \text{Total Fertility Rate} \times \frac{\text{Number of female births}}{\text{Number of births}} $$

In calculating the gross production rate we have not taken
into account the factor of mortality. It has been presumed
that all the new-born female children will survive through the
child-bearing period. But actually this does not happen. Hence
there is need for calculating the net production rate. The net
production rate measures the mean number of female children
which will be born to a newly born female who is subject to the
given fertility and mortality conditions throughout her lifetime.
In other words, the net production rate shows the number of future
mothers that will be born to the present mothers according to
present fertility and mortality. The following examples will
illustrate the calculation of gross and net production rates.


Illustration 1. Compute the gross production rate.

Age group   Total Female    No. of Births   Specific Fertility Rate
            Population      (Female)        (Col. 3 / Col. 2 x 1000)
15-20         2,89,142          9,241             31.96
20-25         3,08,464         51,268            166.20
25-30         3,00,889         56,076            186.37
30-35         3,00,567         39,401            131.09
35-40         2,75,687         20,415             74.06
40-45         2,38,284          5,574             23.39
45-50         2,29,347            409              1.78
Total        19,42,380       1,82,384            614.85

Since each age group covers five years,

$$ \text{Gross Production Rate} = \frac{614.85 \times 5}{1000} = 3.074 $$

This G.P.R. shows that for one present mother there will be
3.074 in future. The population is growing.
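The gross production rate of this illustration can be recomputed as follows. This is a sketch only: the populations and births are those of the table as reconstructed from its printed totals (in particular, the 40-45 female population is taken as 2,38,284), and the variable names are illustrative.

```python
# (female population, female births) for the seven five-year age groups, 15-20 to 45-50.
age_groups = [
    (289142, 9241),
    (308464, 51268),
    (300889, 56076),
    (300567, 39401),
    (275687, 20415),
    (238284, 5574),
    (229347, 409),
]
years_per_group = 5

# Specific fertility rate: female births per 1000 women in each age group.
specific_rates = [1000 * births / population for population, births in age_groups]

# A woman spends five years in each group, so the annual rates are multiplied
# by 5 and converted from "per 1000 women" to "per woman".
gross_production_rate = years_per_group * sum(specific_rates) / 1000

print(round(sum(specific_rates), 2), round(gross_production_rate, 3))
```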


Illustration 2

Compute the Net Production Rate.

Age group of    No. of female         No. of survivors   No. of surviving
child-bearing   children born to      out of each        women by whom
females         1000 women passing    1000 female        present women
                through each age      children           replace
                group                                    themselves
15-20                 50                  850                 42.5
20-25                180                  800                144.0
25-30                450                  750                337.5
30-35                500                  700                350.0
35-40                300                  650                195.0
40-45                100                  600                 60.0
45-50                 40                  500                 20.0
15-50               1620                                    1149.0

$$ \text{Net Production Rate} = \frac{1149.0}{1000} = 1.149 $$

It shows that the population is increasing.
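The same calculation can be sketched in code. The figures are those of the table as reconstructed from its totals (1620 and 1149.0); the variable names are illustrative.

```python
# (female children born per 1000 women, survivors per 1000 female children)
# for the age groups 15-20, 20-25, ..., 45-50.
rows = [
    (50, 850),
    (180, 800),
    (450, 750),
    (500, 700),
    (300, 650),
    (100, 600),
    (40, 500),
]

# Surviving daughters per 1000 women in each age group.
surviving = [births * survivors / 1000 for births, survivors in rows]

# Net production rate: surviving daughters per present mother.
net_production_rate = sum(surviving) / 1000

print(surviving, net_production_rate)
```

A value above 1 means each present mother is more than replaced by surviving daughters; a value below 1 would indicate a population that is not replacing itself.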
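Maternal Mortality Rate

$$ \text{Maternal Mortality Rate} = \frac{\text{Annual maternal deaths}}{\text{Annual births}} \times 1000 $$

Crude Marriage Rate

$$ \text{Crude Marriage Rate} = \frac{\text{Annual marriages}}{\text{Annual mean population}} \times 1000 $$

Infant Mortality Rate

$$ \text{Infant Mortality Rate} = \frac{\text{Annual infant deaths}}{\text{Annual births}} \times 1000 $$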


Estimation of Population Growth. A census of population
takes place every tenth year. Only then can accurate information
about the population be known. There are various occasions
when it is required to make an estimate of population between
censuses, and there are various ways of doing so. A few of those
used by demographers are discussed here.


(i) An intercensal estimate can be made in the following way
if accurate data regarding births, deaths and migrations are
available:

Estimate of Population for 1963

Population on 30th March 1961
    Births registered during 1961-63
    (Deduct) Deaths registered during 1961-63
(Add) Natural Increase
    Overseas arrivals during 1961-63
    (Deduct) Overseas departures during 1961-63
(Add) Net Migration
Estimated population for 1963


(ii) Assuming that the yearly changes in the size of the
population are equal, we can estimate along a straight line for any
intercensal year by the following formula:

$$ P = P_1 + \frac{x}{N}(P_2 - P_1) $$

Where P   is the population estimate at some date between two
          censuses
      P_1 is the size of the population as determined in the first
          census
      P_2 is the size of the population as determined in the
          second census
      N   is the number of years between the censuses
      x   is the number of years between the date of P_1 and the
          date of the estimate

This procedure is known as linear interpolation, because it
follows the path of a straight line between two points. It has
many applications owing to its simplicity and convenience.
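A minimal sketch of linear interpolation between two censuses follows. The census figures are hypothetical and the function name is illustrative.

```python
def interpolate_population(p1, p2, n_years, x_years):
    """Straight-line estimate x_years after the first census.

    p1, p2  : populations at the first and second census
    n_years : years between the two censuses
    x_years : years elapsed since the first census
    """
    return p1 + (x_years / n_years) * (p2 - p1)


# Hypothetical example: censuses ten years apart, estimate wanted 3 years in.
estimate = interpolate_population(400000, 500000, n_years=10, x_years=3)
print(estimate)
```

With x equal to 0 the formula returns the first census figure and with x equal to N the second, which is a quick way to check that it has been applied correctly.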


(iii) Sometimes the population at two censuses is given and
it is required to estimate the population of the middle year. In
that case the geometric mean of the two populations can be
calculated. Symbolically,

$$ P = \sqrt{P_1 \times P_2} $$

Where P_1 = Population at the first census
      P_2 = Population at the second census


(iv) As population increases in a geometric progression, the
following formula is used for finding the population of a
particular year. It is known as the compound interest
formula because of its usefulness in various problems involving
a compound rate of growth. The formula is

$$ P_n = P_0 (1 + r)^n $$

Where P_n = Population at the end of the period
      P_0 = Population at the beginning of the period
      r   = rate of growth
      n   = number of years

The rate of growth r can be calculated as

$$ r = \sqrt[n]{\frac{P_n}{P_0}} - 1 $$
Illustration 3

The population of a town increases according to the
compound interest law. In 1890 and 1940 it was 19,500 and 34,670
respectively. Find out

(a) The population in 1926

(b) The population in 1945

(c) The rate per thousand per annum of increase in
population. (M. Com. Alld.)

Solution. Here P_0 = 19,500, P_n = 34,670 and n = 50, so that

$$ r = \sqrt[50]{\frac{34670}{19500}} - 1 = \sqrt[50]{1.777} - 1 $$

= AL (log 1.777 / 50) - 1
= AL (0.2497 / 50) - 1
= AL (0.0050) - 1
= 1.0116 - 1 = 0.0116

Rate per thousand = 0.0116 x 1000 = 11.6

Population for 1926 (n = 36)
P_n = P_0 (1 + r)^n = 19500 (1.0116)^36
Using logs, log P_n = log 19500 + 36 log 1.0116
                    = 4.2900 + 36 x 0.0050
                    = 4.2900 + 0.1800
                    = 4.4700
P_n = AL (4.4700) = 29,510
Estimated population for 1926 = 29,510

Population for 1945 (n = 55)
P_n = 19500 (1.0116)^55
Using logs, log P_n = log 19500 + 55 log 1.0116
                    = 4.2900 + 55 x 0.0050
                    = 4.2900 + 0.2750
                    = 4.5650
P_n = AL (4.5650) = 36,730
Estimated population for 1945 = 36,730
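The compound interest calculation can be repeated without log tables. The sketch below assumes the census figures 19,500 (1890) and 34,670 (1940) used in the solution; because it carries full precision, its results may differ somewhat from figures obtained by hand with rounded four-figure logarithms, since any rounding of log(1 + r) is multiplied by the number of years.

```python
p_1890, p_1940 = 19500, 34670
years_between = 1940 - 1890

# Annual growth rate from P_n = P_0 (1 + r)^n.
r = (p_1940 / p_1890) ** (1 / years_between) - 1
rate_per_thousand = 1000 * r


def population_in(year):
    """Population implied by compound growth from the 1890 figure."""
    return p_1890 * (1 + r) ** (year - 1890)


pop_1926 = population_in(1926)
pop_1945 = population_in(1945)

print(round(rate_per_thousand, 1), round(pop_1926), round(pop_1945))
```

A useful check on any hand computation is that the formula must reproduce the 1940 census figure when n is 50, and that the 1945 estimate must exceed it.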


(v) There are a number of linear and parabolic formulae for
estimating the mean population of a period. Where a, b, c, d and e
are the populations at five equally spaced dates, the most common
linear formula, used in Australia, is

$$ \frac{a + 2b + 2c + 2d + e}{8} $$

and the parabolic formula is

$$ \frac{a + 4b + 2c + 4d + e}{12} $$

The choice between these trends depends upon theoretical
grounds and upon the phase of the growth cycle.


Illustration 4

                          Linear Method           Parabolic Method
Year    Population (P)    W       W x P           W       W x P
1911      75,17,981       1       75,17,981       1       75,17,981
1921      75,52,694       2     1,51,05,388       4     3,02,10,776
1931      75,79,358       2     1,51,58,716       2     1,51,58,716
1941      76,05,259       2     1,52,10,518       4     3,04,21,036
1951      76,38,628       1       76,38,628       1       76,38,628
Total                     8     6,06,31,231      12     9,09,47,137

Average population

Simple method = (75,17,981 + 76,38,628) / 2 = 1,51,56,609 / 2 = 75,78,305

Linear method = 6,06,31,231 / 8 = 75,78,904

Parabolic method = 9,09,47,137 / 12 = 75,78,928

(vi) The logistic curve, or the curve of population growth,
is another way of estimating population. This curve is also
known as the Pearl-Reed curve. The curve in its simplest
form is

$$ y_c = \frac{k}{1 + 10^{a + bx}} $$

Where

$$ k = \frac{2 y_0 y_1 y_2 - y_1^2 (y_0 + y_2)}{y_0 y_2 - y_1^2} $$

$$ a = \log \frac{k - y_0}{y_0} $$

$$ b = \frac{1}{n} \log \frac{y_0 (k - y_1)}{y_1 (k - y_0)} $$

Where y_0 = Geometric mean of 3 years in the beginning of the
            series
      y_1 = Geometric mean of 3 years in the middle of the series
      y_2 = Geometric mean of 3 years in the end of the series

This theory advanced by Pearl is not universally accepted.
It is highly mathematical and hence beyond the scope of this
book.
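For readers who wish to experiment, the three-point formulae can be sketched as below. Several assumptions are made here and are not taken from the text: the curve is written as y = k / (1 + 10^(a + bx)) with common logarithms, x is measured from the first selected point, n is the number of time units between successive selected points, and the data are synthetic, generated from a known curve so that the recovered constants can be checked.

```python
import math


def logistic(x, k, a, b):
    """Logistic curve in the assumed form y = k / (1 + 10**(a + b*x))."""
    return k / (1 + 10 ** (a + b * x))


def fit_three_points(y0, y1, y2, n):
    """Constants of the curve through three equally spaced points (0, n, 2n)."""
    k = (2 * y0 * y1 * y2 - y1 ** 2 * (y0 + y2)) / (y0 * y2 - y1 ** 2)
    a = math.log10((k - y0) / y0)
    b = (1 / n) * math.log10(y0 * (k - y1) / (y1 * (k - y0)))
    return k, a, b


# Synthetic points from a known curve (hypothetical constants).
true_k, true_a, true_b, n = 200.0, 1.0, -0.05, 20
y0, y1, y2 = (logistic(x, true_k, true_a, true_b) for x in (0, n, 2 * n))

k, a, b = fit_three_points(y0, y1, y2, n)
print(round(k, 6), round(a, 6), round(b, 6))
```

The constant k is the upper limit which the population approaches, so the method is only meaningful when the three selected points already show the growth slowing down.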


Illustration 5
The population of a state at 10-yearly intervals is given
below :

Year    Population in Millions
 x             y
1881          3.9
1891          5.3
1901          7.3
1911          9.6
1921         12.9
1931         17.1
1941         23.2
1951         30.5

By fitting a curve of the form y = ab^x to this data, estimate
the population for 1961. (B. Com. & M. A. Delhi)

Solution. Taking u = (x - 1916) / 5 as the working variable, the
curve becomes y = ab^u, i.e. log y = A + Bu, where A = log a and
B = log b.

Year x     u      y      log y     u log y     u^2
1881      -7     3.9    0.5911     -4.1377     49
1891      -5     5.3    0.7243     -3.6215     25
1901      -3     7.3    0.8633     -2.5899      9
1911      -1     9.6    0.9823     -0.9823      1
1921       1    12.9    1.1106      1.1106      1
1931       3    17.1    1.2330      3.6990      9
1941       5    23.2    1.3655      6.8275     25
1951       7    30.5    1.4843     10.3901     49
Total      0            8.3544     10.6958    168

The normal equations for finding A and B are

$$ \sum \log y = nA + B \sum u $$
$$ \sum u \log y = A \sum u + B \sum u^2 $$

Substituting the values of the sums, we get
8.3544 = 8A + 0, so A = 1.0443 and a = AL (1.0443) = 11.07
10.6958 = 0 + 168 B, so B = 0.0637 and b = AL (0.0637) = 1.158

Hence y = 11.07 (1.158)^u

When x = 1961,

$$ u = \frac{x - 1916}{5} = \frac{1961 - 1916}{5} = 9 $$

so that y = 11.07 (1.158)^9,
i.e., log y = log 11.07 + 9 log 1.158
            = 1.0443 + 9 x 0.0637
            = 1.6176
and y = AL (1.6176) = 41.46.

Hence the population estimated for 1961 = 41.46 (millions).
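The fitting of the curve can be reproduced with a few lines of code. This sketch follows the same steps (coded variable u = (year - 1916) / 5, least squares on log y), with populations as read from the table. It uses full-precision logarithms, so the estimate can differ in the first or second decimal place from one worked with four-figure tables.

```python
import math

years = [1881, 1891, 1901, 1911, 1921, 1931, 1941, 1951]
population = [3.9, 5.3, 7.3, 9.6, 12.9, 17.1, 23.2, 30.5]  # millions

# Coded time variable: origin at 1916, unit of 5 years, so that sum(u) == 0.
u = [(year - 1916) / 5 for year in years]
log_y = [math.log10(y) for y in population]

# With sum(u) == 0 the normal equations separate:
#   sum(log y)   = n * A
#   sum(u log y) = B * sum(u^2)
A = sum(log_y) / len(log_y)
B = sum(ui * li for ui, li in zip(u, log_y)) / sum(ui ** 2 for ui in u)

a, b = 10 ** A, 10 ** B
u_1961 = (1961 - 1916) / 5
estimate_1961 = a * b ** u_1961

print(round(a, 2), round(b, 3), round(estimate_1961, 1))
```

Choosing the origin midway through the series is what makes the sum of u vanish and lets A and B be found independently of each other.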


Theoretical Questions

1. Give a brief account of what in your opinion is the most
accurate method of forecasting the future trends in the size of
population of any given country. (M. Com. Agra)

2. Under what circumstances would you adopt the arithmetic
mean and geometric mean of the two census enumerations as the
mean population of the intercensal period ? Find a formula for the
mean population on the assumption of geometrical progression for
the increase of population.

3. Explain the following :
(a) General Fertility Rate (b) Specific Fertility Rate
(c) Gross Production Rate (d) Net Production Rate.
Practical Questions


1. The population of a town increases according to the
compound interest law. In 1890 and 1940 it was 19,500 and
34,670 respectively. Sketch a graph to show the population at all
times during the period, and use it to estimate

(a) the population in 1926 ;
(b) the population in 1945 ; and
(c) the rate per thousand per annum of increase in
population. (M. Com., Alld.) (M. A., Raj.)

(Given in this chapter as an illustration)


2. Determine which of the towns A and B is more healthy.

                   Town A                  Town B
Age          Population   Deaths     Population   Deaths
0-15           15,000       360        20,000       500
15-50          20,000       400        52,000     1,040
Above 50        5,000       140         8,000       240
Total          40,000       900        80,000     1,780

(B. Com. Agra)
(General Death Rate of Town A = 22.5
 General Death Rate of Town B = 22.25
 Standardised Death Rate of Town B = 23.125
 Town A is healthier than Town B)


3. The mortality data for two towns A and B are given below.
Which of them would you consider to be more healthy and why ?

                   Town A                  Town B
Age          Population   Deaths     Population   Deaths
0-5             8,000       185         2,500        65
5-40           25,000       125        13,000        78
40-75          60,000       420        31,500       252
Over 75         7,000       480         3,000       210
Total        1,00,000     1,210        50,000       605

(Crude Death Rate for both = 12.1. The Standardised Death
Rate for town B taking A as the standard = 13.28.)
4. The deaths of two towns A and B are given according to
age groups, and you are asked to compare the health conditions
of the two towns.

                   Town A                  Town B
Age group    Population   Deaths     Population   Deaths
Under 5        25,000       550        10,000       220
5-15           40,000       280        15,000       105
15-35          60,000       720        20,000       240
Over 35        15,000       525        15,000       525
Total        1,40,000     2,075        60,000     1,090

(B. Com. Agra 1959)
(Crude Death Rates for town A = 14.8, B = 18.2, Standardised
Death Rate for town B = 14.8)


5. Compute from the following statistics of population and
unemployment for the standard and local populations (i) the
general unemployment rate for the standard population, (ii) the
standardised unemployment rate for the local population and
(iii) the crude unemployment rate for the local population.

                 Standard population           Local population
Age group     Population  Unemployment     Population  Unemployment
                           Rate %                       Rate %
15-30            2,500                        3,000
30-45            3,500         5              3,000         4
45-60            3,000         8              3,500         9
60 and above     1,000        12                500        12
Total           10,000

(General unemployment rate for the standard population =
9.15%, standardised unemployment rate for the local population =
9.75%, crude unemployment rate for the local population
= 9.10%)


CHAPTER 20


NATIONAL INCOME AND
SOCIAL ACCOUNTING


"National income may be defined provisionally as the net 
total of commodities and services (economic goods) produced by
the people comprising a nation, as the total of such goods received
by the nation's individual members in return for their assistance
in producing commodities and services, as the total of goods
consumed by the individuals out of the receipts thus earned, or
finally as the net total of desirable events enjoyed by the same
individuals in their double capacity as producers and consumers.
Defined in any one of these fashions, national income is the end
product of a country's economic activity, reflecting the combined
play of economic forces and serving to appraise the prevailing
economic organization in terms of returns."


Simon Kuznets 
(Encyclopaedia of Social Sciences Vol. XI p. 208) 


Prof. Colin Clark remarked that "comparisons of economic
welfare between one community and another, one economic group
and another and between one time and another are the
framework of Economic Science. Anything which can be done to
promote the scope and improve the technique of such comparison
is of fundamental importance."¹ The national income statistics
may be said to be the index numbers of the economic progress
of a nation, because these are the measures which make
comparison between one nation and another, one economic group
and another and between one time and another possible. These
statistics provide a technique for such comparisons.


The need for the measurement of national income was felt
during the depression of the nineteen-thirties. The more
governments interfered in the national economies, the more
statistical research was stimulated. The establishment of a
quantitative picture of the structure of the national economy
was attempted. The statistics of national income may be
regarded as the most useful and appropriate way of showing
the structure of the national economy in quantitative figures.
Let us consider the uses to which the statistics of national
income may be put.


1 Conditions of Economic Progress, p. 16.


The main purpose of national income estimates is to provide
a summary picture of the condition of an economic system: to
exhibit the value of the non-human resources available for
its use, to portray the changes in the stock of wealth, to
set forth the value of goods and services produced by the
economic system during the period under consideration, and to
indicate the various distributive shares going to families and
individuals for the services of their labour and property.


National income estimates provide the best single measure
of a nation's well-being or economic progress. As an
individual's income is an index of his standard of life, so also
the national income of a country indicates the standard of life
which it can support for its people. "Since the end product," says
S. Kuznets, "of each country's economic system is an index of
its producing power, national income estimates furnish a
comparison of the productivity of nations; per capita income
figures, especially when adjusted for differences in purchasing
power of money, appear to measure the nation's economic
welfare."

A continuous series of annual estimates of national income
suggests the trend of the economic growth of the nation
and how rapidly it is taking place. But it should be noted
that national income measures only the total volume
of goods and services and does not tell us how
this total is divided between the poor and the rich.
These estimates play a major part in the analysis of the
economic situation and in the formulation of economic policy.
Though the field of enquiry is highly technical, it occupies
an important position in the world of practical affairs. The
economic policy of a nation is designed to assure the livelihood
of its people, to prevent or mitigate declines in the nation's
material product, and to encourage vigorous economic growth,
securing economic justice in the country and peaceful relations
with the rest of the world. This cannot be done without proper
statistics of national income, which are very useful in the
economic planning of a nation. In addition, comparisons of
the economic growth of various nations at the
same time, or of one nation at several stages, may be made with
the help of national income data.

An estimate of the national income is the necessary
beginning of a serious economic study of any country. What
did we know of the economic state of India before investigations
gave us reasonable estimates of income per head? The
problems of full employment, equitable distribution of income and
wealth and the like have to be discussed in terms of the concept
of national income.


It cannot be said that the task is technically very much 
more difficult than many of the tasks of the analysis of the 
population census which have been so admirably performed in 
almost all the countries. But being summary and appraisal of 
the national economic conditions, national income demands 
statistical measurement. No doubt the statisticians following 
any of the ways towards computation of national income face 
certain technical difficulties, because the statistical information 
is seldom sufficient and must be supplemented by estimates and 
even by guesses. 


Ways of Computing National Income


The national income can be measured from four points of
view. In other words, the national income can be counted at
four points, viz. at the point of production, at the point
of distribution, at the point of consumption, or at the point
where these activities are recorded in the accounting system.
These methods may be called

(1) The Net Output Method
(2) The Income Distribution Method
(3) The Expenditure-cum-Saving Method
(4) The Social Accounting Method.


(1) The Net Output Method. National income may be
computed by adding together the nation's output of goods and
services. This method is also known as the 'Census of Production'
method, and the total is called the 'Net National Output'. It is
arrived at by computing the net value of each industry's output
of goods and services. By net value is meant the total selling
value of the goods and services less the value of those goods and
services which are purchased from other industries or from abroad,
and less the depreciation of the capital goods used in producing
them. It involves the grouping of the national
income according to industries. There are certain
advantages of this system. It gives a general idea
of the following facts :


(a) We may know in a general way what industries have 
been most depressed. 


(b) It will show not only the essential characteristics 
of business fluctuations but also the more gradual 
change in the character of the economy. 


(c) It is urgently needed when studying the productive
capacity of a nation.


(d) It emphasises the fundamental aspect of the economic 
system and provides a co-ordinated view of the 
national economy. 


This method is widely used in under-developed countries. 
This system is considered less reliable due to larger margin 
of error. 


(2) The Income Distribution Method. Secondly, the national
income may be computed by adding the incomes obtained from the
economic activity of the country's inhabitants. This is known
as the 'Income Method'. A national income table constructed
by this method records the distribution of income amongst the
various kinds of income receivers in the form of rent, profit,
interest, wages and salaries. Income can also be classified
according to the occupation, nationality and domicile of the
earners. This method is used in the United Kingdom. The data for
this method of estimating national income are gathered from
income tax returns, sample surveys of income, etc.


(3) The Expenditure-cum-Saving Method. In the third
place, national income may be measured through the ways in which
it is spent. This method is derived from the principle that
the people of a nation receive their income from one source or
another and can do one of two things with it. They
can consume it to satisfy their immediate needs, or they can
invest it by postponing their present consumption. Hence net
national expenditure consists of the sum of the two sets of
activities concerned with
the ultimate disposal of income. The first of these is the total
of goods and services immediately consumed in the country, and
the second includes all the ways of holding wealth which is not
immediately consumed. This method compiles data mostly from
family budgets. It has the greatest margin of error.


The following table shows the simplest case of the income
of a nation measured from the above three points of view :

Net National Income    Net National Output               Net National Expenditure

1. Rent                1. Net Output of Agriculture      1. Expenditure on goods and
2. Profits             2. Net Output of Mining              services for current
3. Interest            3. Net Output of Manufacturing       consumption
4. Wages               4. Net Output of Distribution     2. Net Investment
5. Salaries            5. Net Output of Transport
                       6. Net Output of other Services

Total Net National     Total Net Output                  Total Net National
Income                                                   Expenditure


There are two precautions which should be kept in mind
while computing national income by any of the above methods.
The first is that in each of these calculations an allowance
must be made for the reduction in the value
of the capital equipment occurring during the year. This is
known as the depreciation allowance. Secondly, in order to
calculate net national income, net national output or net national
expenditure, counting any part of the national product more
than once should be avoided. This means, for example, that the
value of raw materials and services which enter into the cost
of production of a given industry must be excluded in measuring
the value of its net output, as they will appear elsewhere in the
same column under their appropriate industry headings. Thus the
value of sugar cane used in a sugar factory should be counted
only once: if it is included in the value of manufactured sugar
computed under industrial production, it must be excluded from
agriculture.
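The two precautions can be seen in a small numerical sketch. All the figures below are hypothetical: an agricultural sector sells cane worth 100 to a sugar factory, which sells sugar worth 180, and each sector has a depreciation allowance. Here the cane is credited to agriculture and deducted from the factory's output, which is one consistent way of counting it only once.

```python
# Hypothetical two-industry economy (all values in the same money unit).
industries = {
    "agriculture": {"sales": 100, "purchases_from_others": 0, "depreciation": 5},
    "manufacturing": {"sales": 180, "purchases_from_others": 100, "depreciation": 10},
}


def net_output(industry):
    """Selling value less goods bought from other industries less depreciation."""
    return (
        industry["sales"]
        - industry["purchases_from_others"]
        - industry["depreciation"]
    )


# Adding up gross sales counts the cane twice: once as cane, once inside sugar.
total_gross_sales = sum(ind["sales"] for ind in industries.values())

# Net output counts each contribution once and allows for depreciation.
net_national_output = sum(net_output(ind) for ind in industries.values())

print(total_gross_sales, net_national_output)
```

The gap between the two totals is made up of the double-counted cane (100) and the depreciation allowances (15).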


There are certain advantages in computing national income
by the above three systems simultaneously. It is
due to these advantages that certain countries, particularly
the U.S.A., use all three systems.


The first and foremost advantage is that each of the three
approaches, in so far as it is based on independent and distinct
calculations, sets out the same national income although the data
are differently arrived at and differently classified. Hence each
of the three totals constitutes a check on the others.


In reality it has been seen that the totals are not the same,
owing to a lack of statistical data somewhere. But still it has
been found that there is a very high degree of correlation between
the three totals derived from the three types of sources. For
example, the coefficient of correlation between the national income
and the national output of the U.S.A. over the period 1929 to 1941
is 0.996, which is certainly very high. The second advantage
is that the table which emerges from the triple calculation
presents in a conveniently tabulated form most of the information
in this field normally required for the formulation and
interpretation of economic policy. The third advantage is that it
reduces the danger of double counting which may arise when
national income is computed from one point of view only.
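
As an illustration of how such a coefficient of correlation between
two annual series is computed, the following is a minimal sketch. The
two series are invented for the purpose (they are not the U.S.A.
figures referred to above), and the function name `pearson_r` is
simply a label chosen here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment coefficient of correlation between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical annual totals from two independent methods of estimation.
income_total = [87.0, 75.0, 59.0, 42.0, 40.0]
output_total = [88.5, 74.0, 60.0, 43.5, 39.0]

print(round(pearson_r(income_total, output_total), 3))
```

A value close to 1 indicates that the two totals move together from
year to year even when they do not agree exactly.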


(4) The Social Accounting Method. It has recently been
realised that the totals of the national income can serve a much
more general purpose and can be of much greater use if they
provide information regarding the structure of the constituent
transactions. The term ‘Social Accounting’ was first used by
Prof. J. R. Hicks in his book ‘The Social Framework’.


National income statistics underwent a basic transformation
during the past two decades, and particularly during the last.
The essence of this change was a shift of emphasis from
measuring total national output to providing a statistical picture
of the economic processes and structure.


The Nature of Social Accounting. Businesses keep records
of their transactions in the form of accounts. Amongst other
things these accounts provide information upon which
business enterprises can base policy decisions. They provide
a continuous commentary on the affairs of business concerns
and are essential to management in controlling efficiency, in
ensuring financial stability and a satisfactory rate of profit, and
in facilitating balanced growth. Just as there are many reasons
which make it necessary for individual businesses to keep records
of their transactions, so it is most important to have records of
the transactions which take place in the national economy as a
whole. These records are called ‘social accounts’. According
to Prof. J. R. N. Stone, "Instead of seeking to build up a
single total such as the national income, an investigation is first
made of the classification of accounting entities, of the type of
accounts that they keep, and of the transactions into which they
enter. In this way all the transacting entities of an economic
system are classified into broad sectors such as productive
enterprises, financial intermediaries and final consumers, and a
series of accounts for each of these sectors is set up in which
separate entries represent economically distinct categories of
transactions. Economic activity is represented by money flows and
related book-keeping transactions, actual or imputed, between
accounts. The national income and other similar aggregates are
obtained from the system by selecting and combining the constituent
entries in the accounts." The development of the social
accounting technique is due to Prof. J. R. N. Stone of Cambridge
University. Messrs Harold C. Edey and Alan T. Peacock observe,
“Social accounting is concerned with the statistical classification
of the activities of human beings and human institutions in ways
which help us to understand the operation of the economy as
a whole.”


The concept of social accounting is unfamiliar to most
people and cannot be understood by them with ease. A more
understandable as well as more illuminating view of the
subject can therefore be obtained by starting with simple
accounting entities and showing their relations with each other.
This can easily be understood if we take the case of the income
and expenditure account of an individual family. On the receipts
side of the account will appear the payments received in the form
of wages or salaries, together with interest or dividends from the
concerns in which the family holds an interest. On the payments
side the family will part with cash to the persons who supply its
needs. It will also pay certain taxes to the public authorities.
After the current payments have been met a certain residue,
positive or negative, will remain. This may be termed saving and
will be transferred to the family’s capital account, in which it
is held either in the form of cash or investment.
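
The family account just described can be sketched in a few lines. The
amounts and the item names below are hypothetical and serve only to
show that saving is the residue which balances the account and is
carried to the capital account.

```python
# Hypothetical income and expenditure account of one family for a year.
receipts = {
    "wages and salaries": 4800,
    "interest and dividends": 200,
}
payments = {
    "purchases from suppliers of the family's needs": 4200,
    "taxes paid to public authorities": 500,
}


def saving(receipts, payments):
    """Residue after current payments have been met; it may be negative."""
    return sum(receipts.values()) - sum(payments.values())


family_saving = saving(receipts, payments)

# The residue is transferred to the family's capital account.
capital_account_receipts = {"saving transferred from current account": family_saving}

print("Total receipts:", sum(receipts.values()))
print("Total payments:", sum(payments.values()))
print("Saving:        ", family_saving)
```

With saving entered on the payments side as a transfer to the capital
account, the two sides of the income and expenditure account balance.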


1. U.N.O., Report on Statistical Methods No. 7.
In this system each transaction will be recorded twice, at the
receiving and at the paying end. No doubt the system is far too
complicated for practical purposes, because it has been assumed
that each single transaction between each single accounting
entity is separately recorded. Such an amount of detail would
be quite unmanageable and means must be found of reducing
its bulk. This can be done in two ways: first by combining
accounting entities, and second by combining transactions so as to
group them into a manageable number of classes.
Thus the main problem which arises in setting out a
system of social accounting is to know how far to go in combining
accounts and transactions. A course must be found between
the unmanageable detail of too many accounting entities and
types of transactions and the lack of information which results
when the process of combination is carried too far. There are
two problems to be faced. First, millions of individual
transactions take place in the economy in one period and, since
so much detail cannot be kept, we have to reduce this mass of
detail to a single transaction that fits into a combined account.
Secondly, we have to reduce all that is taking place in the economy
to one single figure.


The working out of a useful and manageable system of social
accounts is a matter which must be left in detail to the combined
experience of those whose duty it is to compile the records for
such a system in statistical offices all over the world. At present
the United Nations Studies and Reports on Statistical Methods
No. 7 divides the economy to be represented into five sectors, four
within the economy under investigation and one covering the
rest of the world. They are termed productive enterprises,
financial intermediaries, insurance and social security agencies,
final consumers and the rest of the world. Public authorities will
in most economic systems of the world be represented in all five
sectors. Each of the sectors keeps more than one account. They
must keep at least two accounts, a current and a capital account.
The report suggests that four accounts be kept: (1) operating
account, (2) current account, (3) capital account, (4) reserve
account.


Next, the process of drawing up the accounts is simple. Each
account must show its receipts from and payments to each other
account in respect of each separate type of consideration. In
this way each entry in one account has its counterpart on the
opposite side of another account. Every entry appears twice and
there are no loose ends unaccounted for. Each account is
divided into two halves, one the receipts side and the other the
payments side. A feature of this system is that each entry in any
account relates to one and only one type of transaction, so that
the accounts as they stand provide all the information required and
the items of which they are composed do not require further
sub-division.


It will be convenient to illustrate these items by means of
a simple example, so as to bring out the main features of such
a system. In the following example, which has been reproduced
from the Memorandum of Prof. Richard Stone, a closed economy
consisting of only four accounts and three considerations has been
considered. The four accounts are lettered a, b, c and d, and
the considerations are numbered 1, 2 and 3 and they will
relate to goods and services and cash and other financial claims
respectively. Each entry is labelled by the paying account, the
receiving account and the number of the consideration; thus
‘ca 2’ is a payment from account c to account a for consideration 2.


Producers

(a) Business operating and appropriation accounts.

Receipts
Sale of goods and services to:
ca 2  Persons                                               4,000
da 2  Public authorities                                      175
aa 2  Business on revenue account                          40,000
ba 2  Business on capital account                              50
da 1  Subsidies                                                25
      Total                                                44,250

Payments
Payments for goods and services to:
ac 2  Persons                                               3,000
aa 2  Business                                             40,000
ad 1  Indirect taxes                                        1,000
ab 1  Saving (transferred to capital account)                 250
      Total                                                44,250


(b) Business capital account.

Receipts
cb 3  Loans from persons                                      100
ab 1  Saving (transferred from appropriation account)         250
      Total                                                   350

Payments
Payments for goods and services to:
bc 2  Persons                                                 150
ba 2  Business                                                 50
bd 3  Loans to public authorities                             150
      Total                                                   350


Consumers

(c) Personal revenue account.

Receipts
cc 1  Gifts from persons                                      300
dc 1  Transfer payments from public authorities               550
Earnings from work or property received from:
ac 2  Business revenue                                      3,000
bc 2  Business capital                                        150
dc 2  Public authorities                                      900
      Total                                                 4,900

Payments
cc 1  Gifts to persons                                        300
ca 2  Current goods and services                            4,000
cd 1  Direct taxes                                            500
cb 3  Saving                                                  100
      Total                                                 4,900


(d) Public authorities revenue account.

Receipts
cd 1  Direct taxes from persons                               500
ad 1  Indirect taxes                                        1,000
bd 3  Loans from business                                     150
      Total                                                 1,650

Payments
Payments for goods and services to:
dc 2  Persons                                                 900
da 2  Business                                                175
da 1  Subsidies                                                25
dc 1  Transfer payments to persons                            550
      Total                                                 1,650


It is not difficult to think of everyday transactions that
have no place in the above simplified example. In the above
example a little consideration of the four accounts shows that the
national income at factor cost is as follows:


Personal earnings from work or property received from:
ac 2  Business revenue                                      3,000
bc 2  Business capital                                        150
dc 2  Public authorities                                      900
ab 1  Business saving                                         250
      NATIONAL INCOME                                       4,300


Corresponding and exactly equal to the national income is
the national expenditure.


ca 2  Personal expenditure on current goods and services    4,000
Expenditure by public authorities on:
da 2  Goods                                                   175
dc 2  Services                                                900
Capital formation:
ba 2  Goods                                                    50
bc 2  Services                                                150
da 1  Subsidies                                                25
ad 1  Less indirect taxes                                   1,000
      NATIONAL EXPENDITURE                                  4,300


This equality of national income and expenditure is due to
the balancing property of the four combined accounts.
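
The balancing property can be checked mechanically. The following
sketch records each entry of the example once, as a payment from one
account to another, and derives both sides of every account from that
single list. The figures are those of the example as read from the
account totals (gifts of 300 and earnings from business capital of
150, which the totals of 4,900 require); the tuple layout and the
function names are assumptions made here for illustration and are not
part of Prof. Stone's memorandum.

```python
from collections import defaultdict

# (paying account, receiving account, consideration, amount)
ENTRIES = [
    ("c", "a", 2, 4000),   # personal expenditure on current goods and services
    ("d", "a", 2, 175),    # public authorities' purchases from business
    ("a", "a", 2, 40000),  # sales between businesses on revenue account
    ("b", "a", 2, 50),     # capital formation: goods
    ("d", "a", 1, 25),     # subsidies
    ("a", "c", 2, 3000),   # earnings paid from business revenue
    ("a", "d", 1, 1000),   # indirect taxes
    ("a", "b", 1, 250),    # business saving
    ("c", "b", 3, 100),    # personal saving lent to business
    ("b", "c", 2, 150),    # capital formation: services (earnings of persons)
    ("b", "d", 3, 150),    # loans from business to public authorities
    ("c", "c", 1, 300),    # gifts between persons
    ("d", "c", 1, 550),    # transfer payments
    ("d", "c", 2, 900),    # public authorities' payments for services
    ("c", "d", 1, 500),    # direct taxes
]


def account_totals(entries):
    """Return (receipts, payments) totals per account from the single entry list."""
    receipts, payments = defaultdict(int), defaultdict(int)
    for payer, receiver, _consideration, amount in entries:
        payments[payer] += amount
        receipts[receiver] += amount
    return dict(receipts), dict(payments)


def entry(payer, receiver, consideration, entries=ENTRIES):
    """Amount of the entry with the given label, e.g. entry('c', 'a', 2) for 'ca 2'."""
    return sum(a for p, r, c, a in entries if (p, r, c) == (payer, receiver, consideration))


receipts, payments = account_totals(ENTRIES)
for account in "abcd":
    print(account, receipts[account], payments[account])

national_income = (entry("a", "c", 2) + entry("b", "c", 2)
                   + entry("d", "c", 2) + entry("a", "b", 1))
national_expenditure = (entry("c", "a", 2) + entry("d", "a", 2) + entry("d", "c", 2)
                        + entry("b", "a", 2) + entry("b", "c", 2)
                        + entry("d", "a", 1) - entry("a", "d", 1))
print("National income:     ", national_income)
print("National expenditure:", national_expenditure)
```

Because every entry is stored once and read twice, no account can fail
to balance through a copying slip; any imbalance would point to a
wrong figure in the list itself.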


So far the general ideas of social accounting have been
discussed, and with the help of the above example it has been shown
how national income estimates can be built up from the
system of accounts. Now let us discuss some of the detailed
problems regarding the kinds of accounts and accounting entities
to be kept for the purpose. In a closed economy the accounting
entities may be grouped conveniently into five sectors: productive
enterprises, financial intermediaries, insurance and social
security agencies, final consumers and the rest of the world. In
this list the public authorities are not mentioned, for the reason
that each of the internal sectors may be divided into a private and
a public authority sphere.


The second sector comprises the banking system and private
agencies providing financial facilities, such as discount and
acceptance houses, building societies, hire purchase companies,
savings banks and investment trusts.


The third sector, in the private sphere, is composed of all forms
of insurance and assurance companies and societies. It is also
argued that certain charities whose object is to provide financial
relief should be included in this category. In fact many charitable
organizations provide not only services such as the maintenance
and care of children but also financial assistance. In the public
authority sphere this sector comprises all social security funds
and other provisions for relief and assistance which are not
provided by means of a common service. During war time other
aspects of government activity in this sphere have made
their appearance, notably in the field of war risks and war
damage insurance.


In the sector of final consumers, on the private side, are
individuals; non-profit-making concerns such as colleges may,
for simplicity, be put in the same category. On the
public authority side all that part of government activity which
is concerned with the provision of common services such as
education, public health or defence may be included in this sector.


The rest of the world sector may be presented in one single
account, for the reason that from the present point of view we are
not interested in the details of transactions outside the
country under investigation.


Let us consider the nature of the entities with which we
have to deal. In the first place, by the total product of an
economy or a country it is usual to mean one of two things:
either (1) the geographical product, that is the value of goods
and services of all kinds produced in a certain territorial area,
excluding of course the value of any component goods or services
imported into that area, or (2) the product of the factors of
production possessed by the normal residents of that area. The
main difference between these two concepts is that the
former excludes and the latter includes the net income accruing
to an area from the overseas investment of its normal residents.
In the case of backward economies the concept of geographical
product is useful for most purposes.
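
The relation between the two concepts amounts to a single addition,
sketched below. The figures are hypothetical, and treating the net
income from overseas investment as receipts from abroad less payments
abroad is an assumption made here to give the word 'net' a definite
meaning.

```python
def national_product(geographical_product, income_from_abroad, income_paid_abroad):
    """Product of residents' factors = geographical product + net income from abroad."""
    net_income_from_abroad = income_from_abroad - income_paid_abroad
    return geographical_product + net_income_from_abroad


# Hypothetical country whose residents earn less abroad than foreigners earn in it.
print(national_product(geographical_product=1000,
                       income_from_abroad=30,
                       income_paid_abroad=50))
```

When the net income from abroad is negative, as in this invented case,
the product of residents' factors falls short of the geographical
product.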


A productive enterprise is an entity in which a number of
factors of production and intermediate products and services
are brought together for the purpose of producing, usually for
gain, one or a number of goods and services. The contribution
of an enterprise to the national income is measured by taking
the sum of the payments it makes to the factors of production
it employs plus its operating surplus after allowing for
depreciation, obsolescence, bad debts and other similar charges.
The operating account of an enterprise shows the receipts, costs
and charges, and surplus arising from the activity of the period.
At the same time there are other receipts accruing to the
enterprise which are available for distribution, e.g. investment
income and realized capital gains. On the payments side of
this account are shown dividends and withdrawals, taxes, and a
balance which is transferred to the reserve account and
represents an addition to business saving or undistributed
profits. Receipts of capital funds from other sources, such as
borrowing, the sale of assets and claims, and allowances for
depreciation, are shown together with business saving on
the receipts side of the capital and reserve accounts, while the
other side shows the payments for capital equipment and
inventories and for investments and claims of all kinds.


The second sector, viz. the financial intermediaries, of which
the banking system is the most important single element, is partly
owned privately and partly owned by public authorities. The
financial intermediaries require special treatment in view of the
different functions they perform and their method of receiving
remuneration for their services. If we treat banks like ordinary
business enterprises we have to show as their sale proceeds simply
their charges to customers; as a consequence a deficit rather
than a surplus would appear on the other side of the operating
account. This is clearly unsatisfactory. On the other hand we
may credit the interest received by banks to their operating
account but exclude deposit interest from our calculation of the
income generated in banking. The effect of this will be to
include in the national income all interest paid out by concerns of
all types, with the exception of deposit and similar interest paid
by financial intermediaries, together with all wages, salaries and
operating surpluses whether arising in financial intermediaries
or elsewhere.


These difficulties can be removed by the following procedure.
An income is imputed to bank depositors for the use of their
money, equal to the excess of interest and dividends
received by banks over interest paid out, and this income is
assumed to be used in paying for uncharged banking services.
In the case of persons this imputed income and outlay appears
on either side of the revenue account, but in the
case of enterprises of all kinds the imputed outlay is charged
to the operating account. In dealing with the banks themselves it
is convenient to credit to the appropriation rather than the
operating account all interest and dividends received, as in the
case of other enterprises, and to debit the appropriation account
not only with dividends and taxes but also with deposit interest
and the imputed income of depositors. The imputed outlay on
banking services is of course credited to the operating account of
banks.
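
The imputation described above can be sketched as follows. The amounts
are hypothetical. The text fixes only the total imputed income (the
excess of interest and dividends received by banks over interest paid
out); dividing that total between persons and enterprises in
proportion to their deposits is an assumption introduced here purely
for illustration.

```python
def imputed_bank_income(interest_and_dividends_received, interest_paid_out):
    """Income imputed to depositors = banks' investment receipts less interest paid out."""
    return interest_and_dividends_received - interest_paid_out


def allocate_by_deposits(total, deposits):
    """Assumed rule: share the imputed total in proportion to deposits held."""
    total_deposits = sum(deposits.values())
    return {holder: total * held / total_deposits for holder, held in deposits.items()}


# Hypothetical banking sector for one year.
imputed = imputed_bank_income(interest_and_dividends_received=120, interest_paid_out=40)
shares = allocate_by_deposits(imputed, {"persons": 600, "enterprises": 400})

print("Imputed income of depositors:", imputed)
# Persons: the share appears on both sides of the personal revenue account.
# Enterprises: the share is charged to their operating accounts.
print(shares)
# The same total is credited to the operating account of the banks.
```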


The third sector comprises all forms of insurance
undertakings, pension funds, social security funds and government
agencies dealing with assistance and relief. It may seem
strange that such entities should be segregated and grouped
together. They might very well be organized in the form of
productive enterprises. Insurance companies, like financial
intermediaries, are financed from investment income and
payments for the services which they render. But there is also
an additional complication due to the fact that the payments
made by policy holders as premiums comprise in part payment
for services and in part saving. These considerations provide a
reason for keeping insurance and social security agencies
separate from other types of economic activity.


In the accounting system of such entities the revenue, operating,
appropriation, capital and reserve accounts of insurance companies
and societies are shown. The revenue account is itself subdivided
so that transactions with different types of policy holders are
kept distinct. Into these revenue accounts are paid not only
premiums but also imputed charges equal to the investment
income accruing in respect of business done with the different
classes of policy holders. The opposite side of the
revenue account shows the payments in respect of claims and
surrenders, a transfer to reserve in respect of the increase in
accruing liability, and a transfer to operating account of the
balance, which represents the total contribution of the different
classes of policy holders, whether from premiums or from the
investment income of the insurance companies themselves,
towards the cost of conducting insurance business.


The receipts of the operating account are made up entirely 
of the transfers just mentioned. The payments are similar to 
those appearing in any other type of business. 


The appropriation account shows the surplus from the
operating account together with the interest and dividends
received by insurance companies. On the payments side the
imputed income of policy holders appears along with dividends
and withdrawals, direct taxes and transfers of surplus to the
capital and reserve account.


Social security funds operated by government agencies, being
non-profit-making, can conveniently be set up from an accounting
point of view in a simple way. First, the contributions to such
funds are compulsory and for this reason are frequently regarded as
taxes. Second, special contributions out of revenue are often
made by government agencies. The funds are operated on a
not-for-profit basis and are in no sense commercial in character,
so that they show neither a profit nor a loss in addition to their
operating expenses. They may or may not be operated on an actuarial
basis, but they have the financial strength of the government
behind them and may readily develop negative reserves after, for
example, several years of severe unemployment. Unemployment
funds in particular tend to act as a powerful stabilizing
influence, since they tax and save in good times and dissave and
distribute purchasing power in bad times.


The fourth sector comprises final consumers, whether
persons, non-profit-making bodies or public authorities providing
common goods and services. It will be seen that in the case
of persons the accounts could be treated on the pattern of a
productive enterprise. The operating account of an individual
would show on the receipts side payments for services rendered,
together with the value of receipts in kind, pension payments
by employers, cash allowances etc. The payments side would show
the costs incurred in rendering the services in question and the
individual's operating surplus.


Public authorities, in their principal economic aspect as the
organizers of common services, are also treated as final consumers
rather than as producers. This is satisfactory from some
points of view, since they are the last in the chain of economic
transactions leading up to these services and it is clearly
desirable in connection with economic policy that government
transactions should appear separately.


The fifth sector, the rest of the world, being the last account,
brings together the loose ends remaining in all the preceding
accounts. It is not an independent account. It is assumed that
we are only interested in the transactions between the rest of
the world and the country studied and not in transactions taking
place within the rest of the world. The account may therefore
be presented in a consolidated form.


The social accounting approach has certain advantages.
The chief advantage is that it exhibits the economic picture
of the nation in the form of accounts. The presentation
of accounts can be understood by those who are not statisticians
but have a general understanding of the structure of the national
economy. It indicates the cross checks which arise from the
inter-relatedness of sets of transactions and thus makes possible
the most effective use of the statistical information available.


NATIONAL INCOME AND SOCIAL ACCOUNTING 645 


It has thus the advantage that it offers in a concise form a
programme for research. By studying the figures it is easy
to see where statistical information is lacking. It also helps
in stimulating statistical enquiries where they are really needed
and provides the groundwork for a systematic collection of
information. It provides a meeting point for economic theory
and practical measurement.


PROBLEMS OF SOCIAL ACCOUNTING 
A 


(In Advanced Countries) 


In recent years the study of so-called social accounting
has been particularly stressed by economic statisticians, and
various methods of presentation of such accounts are used.
There are certain problems in the way of construction and
presentation of such accounts which even the highly
industrialized countries, where the statistical reporting
agencies are very efficient, have to encounter.


Social accounting may be considered as a general scheme
covering all statistics of the various sectors of the economy and
bringing them together under one unifying principle. It can
do much to promote unification to the degree that
should be aimed at. It cannot be denied that statistical data
are being compiled for extremely divergent purposes and that
consequently divergence of definitions and methods is to a certain
extent unavoidable. The usefulness of many statistical sources
would be far too much restricted if they were brought under one
rigid system of definition and classification. Yet experts will
agree that progress as regards the mutual comparability of
economic statistics is desirable.


Since social accounts are largely based on available statistical
sources, the problems arising from those sources may be considered.
For example, the income statistics compiled by fiscal authorities
cover only assessed income and assessable profit. They do not
therefore include reductions authorized by law, and they may
include amounts which from an economic point of view do not
constitute income, e.g. gains from capital transactions. The
social accounting system demands data on true income and actual
profits. Additional evaluations must be carried out in order
to derive from the official statistics the data actually needed.
These estimates may include such difficult items as the
evaluation of total income below the exemption limit of the
income tax and the total evasion of income tax, which is a very
difficult task indeed.


There are numerous sources of information, many directly
associated with administrative requirements, which to a greater
or less extent are relied on in all countries in building up
estimates of national income and expenditure. Though
unavoidable, the use of information from such sources presents
many difficulties. There is serious danger of bias in all such
cases, and this is all the more important since its extent can
rarely be assessed. Income tax data illustrate this difficulty
very well.


Next, the incomparability of statistical sources gives rise to a
number of problems. The classification of products in the census
of production may differ widely from the classification used in
trade statistics, and they may be expressed in quite different
units (e.g. leather is measured either in weight or in square
measure, timber in weight or in volume, etc.). Another problem
arises from the divergent purposes for which goods may be
consumed. Only in rare cases do statistics make possible a
classification of goods by the various uses to which they are put.
In most cases evaluations are necessary. Estimates have to be
made, e.g., of the share of household consumption in total
coal consumption, and similar is the case with gas, electricity etc.


Another group of problems arises from the lack of uniformity
in accounting practice. Such items as depreciation allowances,
reserves for replacement, the valuation of stocks of raw materials
and finished products, and the book value of existing plant and
equipment may be based on highly divergent principles and may
greatly vitiate the compilation of adequate statistics.
Further, the treatment of households as a branch of activity
would necessitate a number of imputations, for example for the
services rendered by durable goods and individuals.


Bias is a particularly dangerous form of inaccuracy, since
with biased statistics margins of error cannot be interpreted
in the ordinary way and one cannot expect the usual cancellation
of errors that occurs when these are sampled at random. This
difficulty is particularly great when a number of components of
a given total are based on a common biased source. These
difficulties can be solved by the greater use of the sampling
method.


In such a state of affairs when statistical data suffer from 
defects Prof. Richard Stone’s suggestion which may be accepted 
by the experts on the subject is as follows :— 


“We should complete as much of the picture as possible
from sources such as a census of activity in which it might be
supposed that the definitions conform closely to what is needed.
If the greatest possible use had been made of these sources we
should then turn to administrative statistics and piecemeal
information available from other sources, public and private, and
should consider in each case the reliability of the estimate of
some component in the social accounts as obtained from these
sources. We should almost certainly find a wide range of
reliability, although we should have to recognize that the
assessment of the reliability was to a large extent subjective.
In some cases we should find that we had little or no information
at all and could therefore, in the absence of new enquiry, do no
more than guess at a suitable figure. Having reviewed the
situation in this way we could then attempt to use our sampling
organization to obtain estimates, first in those cases for which
little or no information was available and second in all those
cases in which the estimated reliability was lower than the
reliability of the sampling procedures. In this way we should
make the best use of the various sources and procedures
available, and in the course of this work we should have performed
a useful survey of statistical sources in relation to the
requirements of social accounting which would be helpful in
indicating the fields in which official statistics stand in most
urgent need of improvement.”

B 


(In Backward Countries) 


The difficulties which a backward country has to face in
the estimation of national income through the social
accounting procedure are of two types, one conceptual and the
other statistical. If social accounting and its economic
analysis are to be utilized for the formulation of
national policies, the proper solution of these problems is
necessary.
It is true that for some sectors of a backward economy it
is impossible to draw up social accounts on the basis appropriate
to a more advanced economy. The conceptual problems which
stand in the way are due to the nature of the backward
economy. The bulk of the commodities and services produced
in a backward economy are not exchanged for money as in the
case of an advanced economy. A considerable portion of such
output does not enter the market at all. Either it is consumed
by the producers themselves or it is bartered for others' produce
or services. This is due to the subsistence nature of the economy.
Thus the problem of imputing a value to such output and services
arises and presents a difficulty. This problem may
be solved by dividing the national accounts into ‘monetary’ and
‘non-monetary’ sectors.


The difficulty of measurement is further enhanced by
the fact that many producers have no idea either of the quality
or of the value of their product. This is due to the illiteracy of
the masses. In more advanced countries economic statistics are
collected directly from individuals, who generally keep
accounts of their economic activities. In backward countries
accounts are not kept by the people unless they are required
by law. This presents a difficulty in obtaining data on the
production and consumption of the people. The next problem
is the dearth of technical personnel, which is due to mass
illiteracy and the semi-subsistence nature of economic activity.
Thus an element of guess work enters into the assessment of
output, especially in the large sector of the economy which is
dominated by small producers or household enterprises.


Then again the backward economy is distinguished by the
comparative lack of differentiation in economic functioning.
Most of the people perform, simultaneously and without
differentiation, functions which would normally fall under
different industrial categories. For example, agricultural
producers pursue other occupations not relating to agriculture in
their off-season. Hence the customary classification of national
income by industrial origin cannot be taken except as a rough
approximation to a classification of distinct groups in the
population whose main income is derived from a single sector.


A great deal of analytical work remains to be done on
questions of definition and classification with regard to the
problem of measurability in national income estimates in a backward
economy. Concepts and classifications can be tested as to
their usefulness and effectively re-formulated only if they rest
upon a cogent view of the operation of the economy which is
the subject of economic analysis.


Besides the conceptual problems relating to national income
studies, which have been mentioned above very briefly, there are
much larger problems which arise from the non-availability of
the statistical data urgently needed for estimating the
national income and constructing social accounts. The
materials for the estimates are known to be deficient in all cases.
It is difficult to get any current data on the economic structure
of the basic industry and its related activities; no information
can be acquired on the consumer expenditure of the population
attached to the land or on their savings, if any.
Alternatively, all that can be done is to make the most
convincing estimates possible, using guesses to fill in the gaps.
No doubt the need to resort to such guesses in calculating the
value of economic activities about which information
is vague and imprecise leaves grave weaknesses in the final total.
But until a country’s statistical reporting system directly covers
economic activity it will always be necessary to employ guesses
in our estimates, and thus we must expect some margin of error.
What we are expected to do in such circumstances is to
minimise this margin of error.


Thus we must emphasize that, for estimating and analysing
the national income and various other related totals, an
accumulation of intellectual and technical capital is very much
required. Basic changes in economic functioning and economic
intelligence are closely inter-related, and if there is to be
economic development the advancement of material and
intellectual capital is of great importance.


Questions 


1. Describe the various methods for calculation of 
National income. 

2. Write an essay on the Social Accounting technique of
computing national income.

3. In what way are national income statistics helpful in
describing the features of an economy?


CHAPTER 22 


SAMPLE SURVEYS


"By a small sample we may judge of the whole piece."
MIGUEL DE CERVANTES


Meaning. The use of sample surveys to elicit information
about the populations from which the samples have been drawn
is one of the most important applications of the theory of
sampling. There are two types of surveys: a census or full
count survey, and a sample survey. In a census survey every
unit of the population under investigation is included in the
survey. Under a sample survey we observe only a representative
fraction of the whole population and from it calculate or
infer something about the characteristics of the population.
Mr. L. Moss, the Director of the Social Survey, U.K., has defined
a sample survey as “a method of collecting detailed information
relating to representative groups under controlled conditions.”


Importance of Sample Surveys. A sample survey should
not be taken as an inferior substitute for a census survey. The
outstanding advantage of a sample survey over a census survey
lies in the fact that it is practicable to collect much more detailed
information from a relatively small number of people than from
a large number. Mr. Moss has pointed out that “experience
seems to show that it is wrong to assume that a census must
automatically be more correct than a sample.”¹ Indeed, a
properly designed sample survey is often conducted to test
the reliability of a census survey.


¹ Paper read at a conference organised by the Association of
Incorporated Statisticians on modern sample survey methods in
December 1953.

As sample surveys possess the advantages of practicability,
greater speed, greater scope, greater accuracy and reduced cost,
they are extensively used in social, economic and business
research.


A sample survey is usually the most efficient means of
studying such phenomena. The following are the reasons for
this:

(a) A sample survey is sometimes the only possible method.
    When it is desired to test the quality of industrial
    products, a sample survey is the only possible method
    of quality control.

(b) A sample survey is often the only practical method.
    A statistical population is not always infinite, but
    frequently it is composed of thousands of items.
    Usually when the universe is large, a sample survey
    is the only practical method of analysis.

(c) Sampling is usually the most efficient method. In
    the words of W. Edwards Deming, “Actually, the
    relatively small number of items drawn for the
    sample survey may be collected with greater accuracy
    and yield better results than a complete enumeration
    of the population.” Whether better results come
    from a sample survey or a census survey depends on the
    nature of the problem and the type of population.


A sample survey may be conducted:

(a) To sample a full count which has already been
    conducted. Such a sample survey is conducted to test
    the correctness of the full count that has already
    been done. Generally, after a population census has
    taken place, such surveys are conducted to test the
    validity of the results obtained by the full count survey.

(b) To obtain information for a special purpose. Most
    of the social, economic and business sample surveys
    come under this head. Such surveys cover diverse subjects
    like sickness, demand for a product, economic
    conditions, etc.

(c) To obtain continuous information on the behaviour
    of certain economic quantities. Such surveys are
    useful for getting up-to-date information and revising
    the results.


Development of Surveys. The enumeration of populations
by means of a census is centuries old. There are records in
India, Egypt, Rome, etc. that kings carried out censuses for
military and fiscal purposes. The sample survey, however, is
of quite recent origin. The first survey based upon a random
sample of the population was carried out in 1912 by the late
Professor Bowley in Reading, U.K. He measured the incidence
of poverty among the working class. With the growth of
unemployment after the First World War, surveys were undertaken
in many countries. By 1930, surveys had become popular in advanced
countries like the U.K. and the U.S.A. In the U.S.A. private business
houses too use this device for gathering information about their
products, and the technique has become very popular
for market and consumer research purposes. In India,
in 1934, the Bowley-Robertson Committee strongly advised the then
Government of India to conduct sample surveys to compile
statistics on economic affairs. But it was in 1950 that the National
Sample Survey organization was established by the Government
of India to conduct nation-wide sample surveys according to
the scheme prepared by Prof. P. C. Mahalanobis.


The Principal Steps in a Sample Survey. Sample surveys
vary greatly in their complexity. To take a sample from 5000
cards, neatly arranged and numbered in a file, is an easy task.
It is a difficult matter to sample the inhabitants of a region
where transport is by water through the forests, where there
are no maps, where fifteen different dialects are spoken and
where the inhabitants are suspicious of a stranger and very
suspicious of an inquisitive stranger. Different sample surveys
will have different problems to be tackled in different ways, but
the basic pattern and principles remain much the same.


The principal steps in a survey are :—


1. Statement of the objectives of the Survey—A clear
statement of the objectives of a sample survey is most desirable.
The surveyor should be able to answer the question, ‘What is the
problem under review and how may the survey help?’ He
should make a detailed study of all the facts concerning the
survey. The maximum information must be obtained for a
given cost.

2. Definition of the population to be sampled—The word
population in statistics means the aggregate from which the
sample is chosen. The population to be sampled must be the same
as that about which information is to be gathered. The population
should be compact and easy to sample.


3. Determination of the sample and data to be collected—
The sample selected must be of such size and composition that
it will give the most reliable results. As a preliminary to the
selection of a sample, the population must be sub-divided into
parts, which may be called sampling units. The sampling units
must together comprise the whole of the population and they
must be non-overlapping. This helps to ensure that a representative
sample can be drawn. So far as the data to be collected are
concerned, they should be relevant to the purpose and no essential
data should be omitted. Sometimes it becomes clear only after the
end of a survey that it would have been
helpful if some additional data had been collected on a
particular problem.


4. Methods of measurement—This relates to the choice of
the methods of measurement to be employed. There are two
main methods, viz. the postal enquiry and the survey employing
interviewers. The questionnaire should be properly drafted. A
badly designed questionnaire may ruin an otherwise
well-conducted survey.


5. Organization of the field work—The personnel of the
survey must be given training in the purpose of the survey and
in the methods of measurement to be employed. It is desirable
that the questionnaire be tested by a ‘Pilot Survey’, which is
a survey in miniature. The more experienced interviewers should
be engaged on this pilot survey, because they are capable of
assessing the weaknesses of the approach or of any questions on
the questionnaire. The results of the pilot survey are not so
important as are the lessons learned. Before the field work
starts, a briefing conference for the interviewers is normally
held. Any difficulties met and the lessons learned in the pilot
survey are examined and a course of action is laid down for
specified circumstances.


6. Summary and analysis of the data—Soon after the field
work begins, completed schedules will begin to pour into the
survey office. The answers given by the persons interviewed
are to be scrutinised, and mistakes and inconsistencies are to be
noted. The editing of the data requires careful consideration and
supervision. The data are presented in tables, etc., and a report
on the survey is drafted.
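
The drawing of a sample from a numbered frame, such as the file of 5000 cards mentioned earlier, may be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the card values are randomly generated stand-ins for real records, and the names `simple_random_sample` and `cards`, the sample size of 200 and the seeds are all hypothetical choices made for the example.

```python
import random
import statistics


def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size units without replacement, each unit equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, sample_size)


# Hypothetical frame: 5000 numbered cards, each carrying one measured value.
# The values are generated at random purely for illustration.
value_rng = random.Random(0)
cards = [{"card_no": i, "value": value_rng.gauss(50, 10)} for i in range(1, 5001)]

# Sample survey: observe only 200 of the 5000 cards.
sample = simple_random_sample(cards, 200, seed=1)
sample_estimate = statistics.mean(c["value"] for c in sample)

# Full count, shown only to compare with the sample estimate.
census_value = statistics.mean(c["value"] for c in cards)

print("sample estimate:", round(sample_estimate, 2))
print("full count     :", round(census_value, 2))
```

Because the sampling units (the cards) are non-overlapping and together make up the whole population, each card has the same chance of selection, and the sample mean can be compared directly with the full count.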


Questions 


1. “A sample survey is never so satisfactory as a full count,
hence sample surveys should be conducted only when it is impossible
to get a full count of the population.” Comment.


2. Outline the principal steps to be taken in a sample survey.


CHAPTER 23


STATISTICAL QUALITY CONTROL 


“Without quality control you, as a producer or purchaser, are
in the same position as the man who bets on a horse-race, with one
exception: the odds are not posted.” E. M. Srrapman.


Introduction 


Statistical Quality Control is an important application of
the theory of sampling in the industrial field. It was the need
for vast quantities of highly standardised products in World
War II that gave the field of statistical control its impetus.
Statistical Quality Control methods reduce the load on the
executive and increase the uniformity and standardisation of
processes and their end products. Quality control is a specialised
technique used to improve the technical efficiency of production
processes. It may be defined as the art and science of making
the most economic use of resources, human and material, in the
manufacture of goods to satisfy human wants. It may be
considered as the maintenance of quality in a uniform flow of
manufactured products. The feature of modern industry is mass
production and division of labour. The manufactured products
may be intended for use as they are, e.g. rolls of cloth, or in
conjunction with other parts made elsewhere, e.g. a component
of a machine. For all the precision of modern engineering, no
two pieces following one another off the same machine are
identical. The differences may be so small as to be invisible to
the naked eye, but they exist.


Causes of Variation in Quality :—There are various ways
of measuring the quality of industrial products, but when we are
concerned with mass produced components, quality can generally
be measured by a fairly simple characteristic of the component
under consideration. A screw manufacturer may be interested
in the width of a screw, its hardness, the strength of the material, etc.
If he thinks of quality in this way, it will be immediately evident
that quality will never be absolutely constant. However refined
the manufacturing process may be, two screws will not be
identical. There will always be a certain amount of variation in
quality. Variations in the quality of a product are inevitable.
But a limit can be fixed beyond which the variations should not
go. Variation in quality can be attributed to two main types
of causes :—


1. Chance Causes :—Some variation in quality is certainly
attributable to chance. This is usually so small and insignificant
that it may be ignored; in any case, since it is caused by
numerous independent factors, it would be uneconomic and even
impossible to trace them all. For example, the quality of brass
castings being turned on a lathe will certainly vary.


2. Assignable Causes :—The important and larger
variations are attributable to assignable causes. These are defects
in the production process which by themselves will adversely
affect the quality of the product, e.g. excessive wear on the
cutting tool, bad handling of the machine by the operative, and so
on. These causes can and must be traced as soon as their
presence becomes apparent, if the product is to be of the required
standard.


Quality control enables those in charge of production to
verify whether variations in the quality of the product are
attributable to chance or to assignable causes. If they are of the
latter type, then remedial action by the executive is called for.


Why Quality Control :—One method widely used to ensure
that defective or inferior quality products are not passed into
stock from the factory is to have an inspection department.
Usually the inspection is 100 per cent, every product being
examined and the worker being paid on the accepted output.
There are two defects in this system. First, the faulty work is
detected only after it has been done, and if several processes
have been carried out after the piece became faulty, the machine
and labour time wasted is considerable. Further, human nature
is such that even a 100 per cent inspection system is no guarantee
that only satisfactory products will leave the factory. The cost
of such an inspection department is also often considerable.

The Theory of Quality Control :—The best inspection system
is that which detects a fault as soon as it appears, i.e. at its
origin, while also dispensing with 100 per cent manual inspection
of the end product by substituting a virtually fool-proof system
of continuous sample inspection. Such a system is provided by
the technique of quality control. It is accepted that variation
in the quality or size of a product is inevitable and, within certain
limits, permissible. First of all, the causes of any variation should
be ascertained. If variation is due to chance causes, it is usually
so small and insignificant that it can be ignored.


Technique of Quality Control :—A standard quality is
determined taking into account the present development of
production techniques. For example, in a cigarette manufacturing
company a standard weight of, say, 100 cigarettes may be set by
taking the average weight of 100 cigarettes under satisfactory
operating conditions. The standard is fixed so that the quality is
acceptable to consumers and the cost is according to expectations. As
production continues, minor variations in the weight of successive
samples must be expected because of a number of minor causes of
variation, like small differences in the moisture content of the
tobacco, thickness of paper, texture of adhesive used and other
such causes. Therefore zones or limits are laid down within which
variations are permissible. The zone is described by an upper
control limit and a lower control limit on opposite sides of the
standard. Generally 3σ is taken as the limit because, according
to sampling theory, this covers more than 99% of the items. If
observations fall outside these limits, some major cause of
variation is assumed to be at work: there is an assignable reason
for such a wide variation, some maladjustment in the process.
When the trouble is found and corrected, the observations will
again return to their expected pattern about the standard, within
the control limits.
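
The use of upper and lower control limits may be sketched in Python as follows. This is a minimal sketch, not a full control-chart procedure: the standard weight, the standard deviation and the sample weights are hypothetical figures invented for illustration, and the function names `control_limits` and `out_of_control` are assumptions of the example.

```python
def control_limits(standard, sigma, k=3.0):
    """Return (lower, upper) limits placed k standard deviations either side of the standard."""
    return standard - k * sigma, standard + k * sigma


def out_of_control(observations, standard, sigma, k=3.0):
    """Return (position, value) pairs for observations lying outside the control limits."""
    lower, upper = control_limits(standard, sigma, k)
    return [(i, x) for i, x in enumerate(observations) if x < lower or x > upper]


# Hypothetical figures, for illustration only:
standard_weight = 100.0   # standard weight of a sample of 100 cigarettes
sigma = 0.5               # standard deviation under satisfactory conditions
sample_weights = [100.2, 99.7, 100.9, 101.8, 99.1, 98.3]

lower, upper = control_limits(standard_weight, sigma)
flagged = out_of_control(sample_weights, standard_weight, sigma)

print("control limits:", lower, upper)
print("samples calling for investigation:", flagged)
```

Samples inside the limits are put down to chance causes; those flagged would prompt a search for an assignable cause.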


Illustration —1 


A factory using quality control methods mass produces an 
article and past records show that on the average 4 articles are 
found defective out of every batch of 100. What is the maximum 
number of defective articles likely to be encountered in a batch 
of 100 ? 


It is brought to your notice that recently several batches of 
100 were turned out containing 11 to 15 defectives. What 
inference would you draw ? (М.А., B.H.U.) 


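Standard deviation $= \sqrt{npq} = \sqrt{100 \times 0.04 \times 0.96} = \sqrt{3.84} \approx 1.96$

$3\sigma = 3 \times 1.96 = 5.88$

Control limits $= 4 \pm 5.88$, i.e. $9.88$ and $-1.88$. The lower limit is
taken as 0, as the number of defectives cannot be negative. Hence 9 is
the maximum number of defectives which can be expected.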


If the number of defectives is 11 to 15, the variation is due 
to assignable causes, which must be found out and remedied. 
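
The arithmetic of Illustration 1 may be checked with a short Python sketch. It assumes, as the illustration does, that the number of defectives in a batch follows the binomial law with n = 100 and p = 0.04, and that the limits are set at three standard deviations; the function name `defectives_limits` is hypothetical.

```python
import math


def defectives_limits(n, p, k=3.0):
    """Mean, standard deviation and k-sigma limits for the number of defectives in a batch of n."""
    mean = n * p
    sigma = math.sqrt(n * p * (1 - p))      # sqrt(npq)
    lower = max(0.0, mean - k * sigma)      # a count cannot be negative
    upper = mean + k * sigma
    return mean, sigma, lower, upper


mean, sigma, lower, upper = defectives_limits(n=100, p=0.04)

# Largest whole number of defectives still inside the upper limit
max_expected = math.floor(upper)

# Batches reported with 11 to 15 defectives
reported = range(11, 16)
outside = [d for d in reported if d > upper]

print("standard deviation:", round(sigma, 2))
print("maximum expected defectives:", max_expected)
print("counts outside the limit:", outside)
```

Every count from 11 to 15 lies above the upper limit, which agrees with the inference that assignable causes are at work.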


Advantages of Quality Control. The general objective of
quality control is to maintain quality. The alternative and
traditional technique is 100 per cent inspection of the output
of a product. Compared with 100 per cent
inspection, quality control has the following advantages :—


1. As quality control involves inspection of only a fraction 
of the output of a product, costs of inspection are greatly reduced 
and efficiency of inspection increased. 


2. With 100 per cent inspection, unwanted variations in
quality may be detected later than with the continuous sampling
technique of quality control. This means that a greater volume
of faulty products will have been produced and a greater delay
in the rectification of faults in the production process will occur.
Quality control ensures early detection of faults and hence a
minimum waste of rejected production.


3. Quality control enables a process to be brought into and
held in a state of statistical control, i.e. a state in which variability
is the result of chance causes alone. When a process is under
control the quality of the product can be accurately specified,
in that limits can be specified within which, say, 99 per cent of the
product will lie. So long as statistical control continues, these
specifications can be accurately predicted for the future, which
even 100 per cent inspection cannot guarantee to do.
Consequently it is possible to assess whether the production processes
are capable of turning out products which will comply with any
given set of specifications.


4. Whether or not a change in the production process 
results in a significant change in quality can be readily detected 
by quality control. 


5. When the test of quality is destructive, e.g. proofing of
ammunition, etc., 100 per cent inspection is impossible. In such
cases sampling must be resorted to. The application of the proper
sampling methods of quality control ensures not only that the
quality is controlled, but also that valid inferences about the
total output are drawn from the samples.


6. Statistical quality control methods also help (i) to
determine, on a basis of cost and convenience, the size of the
sample in relation to the reliability desired of the test results,
and whether the quality shall be appraised
by means of the average test value, or the range in the test
values, or the proportion of the test values above or below a
specified value, or by some other statistical parameter; (ii) to
trace the sources of variability, to identify assignable causes and
to assess the relative magnitudes of the variation arising from
different sources; (iii) to assist adjudication between rival
proposals for reducing variation when a contributory cause has
been identified, and between rival machines, operators and
processes; and (iv) to improve the conditions governing the sale
of products to consumers.


Thus regular use of statistical quality control methods will
change the role of inspection from that of an “unsympathetic
detective to that of a helpful constable.”

Conclusions. In its simplest form quality control represents
an application of normal curve theory and sampling. While
sampling in the ordinary meaning of the term permits an
approximate assessment of the whole, with statistical methods it
enables one to judge whether or not successive small samples
are of a different quality from each other.


Theoretical Question

1. Write a short essay on Statistical Quality Control.


CHAPTER 24


INTERPRETATION OF
STATISTICAL DATA


"Unfortunately, errors of method and interpretation of 
quantitative data are common—mainly because statistical methods 
of analysis and the nature of statistical results are unknown to 
many who attempt to analyse and interpret quantitative data." 

NEISWANGER


Meaning and Importance. Interpretation stands for the
technique of drawing inferences from an analytical study of the
collected figures. Statistics is a science and, as such, a person
who is not fully acquainted with its technicalities is not
in a position to understand the significance of various statistical
measures. A statistician, besides collecting and analysing
data, has to draw inferences and explain their significance to
the layman. The task of drawing conclusions or inferences and
of explaining their significance after a careful analysis of selected
data is known as interpretation. As a matter of fact,
interpretation is the main function of the statistician, and
collection and analysis are auxiliary functions,
a necessary precedent to interpretation.


The statistical data, after they have been collected, presented
and analysed, are interpreted. The task of interpreting statistical
data is a specialised one, and therefore only experts can draw
logical conclusions from them. A statistician is expected not
only to assemble and analyse the data but to interpret the
result of his findings. As a matter of fact, interpretation is the
main function of a statistician. The collection and analysis of
data are regarded as an auxiliary function. Correct
interpretation, however, depends upon the proper application of
statistical methods in the collection and analysis of data. The work of
interpreting statistical data is of a technical nature and, if
left to non-experts, is likely to lead to misuse of statistics. Any
person without adequate knowledge of statistical methods as
well as statistical operations and their implications is apt to
draw wrong conclusions. In the interpretation of statistical data,
common sense is as important as technical knowledge.
But common sense is rarely common, and this leads to wrong
interpretation of statistical data. A statistician can draw
precise inferences if he is fully conversant with the problem
under study and its implications. For this it is necessary that
the whole work connected with the particular problem should be
entrusted to the same person and, if this is not practicable owing
to heavy work, it should at least be under his supervision and
guidance.


Extra care has to be taken in interpreting results where
statistical methods are used. Statistical methods are used
where the law of the single variable cannot be applied. According to
Prof. Yule, statistical methods are “especially adapted to the
elucidation of quantitative data affected by a multiplicity
of causes.” This is in fact the fundamental difference between
the experimental and statistical methods. The data, especially
those pertaining to the social sciences, reflect a complexity of causes. This
fact makes the task of interpretation all the more difficult.
Statistical inferences must be drawn with careful regard to all
these limitations. The value and uses of statistics lie not in
the figures themselves but in the deductions to be obtained
therefrom. The distrust of statistics is mainly due to
conclusions, unwarranted by facts, drawn by non-experts.


As statistical method is not entirely automatic in its
operation, accuracy in results depends to a considerable degree
on the personal qualities and abilities of the interpreter. He
must be creative of ideas and should strike a nice balance
between being too far-fetched and fanciful on the one hand and
being conservative and unwilling to admit new facts on the other.
Common sense is as much a chief requisite, and experience as great a
teacher, in the delicate task of interpretation as they are in the
other phases of statistical work. Freedom from bias and
prejudice is all the more necessary in interpretation, as it is
interpretation with which the layman is concerned. Great care
should consequently be exercised in the interpretation of
statistical material so that any generalisation is kept within
the limits of the evidence.

Pre-requisites of Interpretation. As the work of
interpretation of statistical data is of a difficult nature, extra care and
minute attention to details are required. In order to draw
reasonable and correct conclusions, the following considerations
should be borne in mind.

1—Adequacy of data—Inferences will be correct and
reasonable when they are based on adequate data. Not only
should the items be properly selected, but the observations
taken must be numerous enough to present a correct picture of the
parent universe. According to the Law of Inertia of Large
Numbers, inferences will be reliable only when the number of
observations is sufficiently large.

2—Homogeneity—When cases are being compared for
drawing inferences, it would not be possible to do so without
homogeneity. If bad debts are being reviewed with a view to
drawing some conclusions, it would be a mistake to compare the
percentage of bad debts in two concerns when in one they are
calculated on sales consisting of cash as well as credit and in
the other on credit sales only. In short, there must be logical
consistency.

3—Stability—Before accepting the data for interpretation,
it must be seen that the data are likely to be stable.
The test of stability is that the results repeat in similar
experiments. If there is lack of stability, interpretation will
not be of any use.

4—Relevancy—The data taken for interpretation must be
relevant to the problem. There must exist some significant
relation between the purpose of the enquiry, the results of the
analysis and the drawing of inferences. In the absence of this
condition, the inferences drawn will not be reliable.

5—Accuracy—The data must be free from all types of
errors, biased as well as unbiased.

6—Scientific Analysis—It is also required that the data
must have been scientifically worked out and analysed.

If the above facts are kept in view, the chances of wrong
interpretation will be considerably reduced.

Before interpreting statistical data the following questions
should be answered, and inferences should be drawn only if
the answers are satisfactory :—

1—Are the results reasonable?

2—Is it logically possible to reach any other conclusion?

3—Is a solution based upon past conditions sound as a
basis for predicting future conditions?

4—Is the group of data sufficiently large to give a
dependable indication of the characteristics as they apply to the
problem in hand?

5—What are the methods applicable to the data? Is there
too much idealisation?

6—Are the data accurate and complete enough to justify the
application of the methods used?

7—Are there cumulative or other errors in the compilation
or calculations? Or will the application of the results necessarily
involve cumulative errors?

8—Has there been comparison of non-comparable data?

9—Have percentages been erroneously used?

10—Have previous convictions influenced either the selection
of data or the application of statistical methods?

11—Have any important factors which may be discovered
by logical reasoning been neglected in the analysis?

12—Has there been repeated use of methods which will
introduce error?


Sources of Error. Errors in interpretation are due to 
various factors—intentional or otherwise. There is an innate 
desire in man to interpret the data and in doing so very often 
he disregards his limitations. He tries to interpret data without 
proper mental preparation and scientific training. The different 
sources of errors in interpretation of statistical data are :— 


1—The average used—In certain cases conclusions based on
averages lead to wrong inferences. Errors are frequently made
by imputing to each member of a group the average behaviour
of the group. This is an error in deduction, i.e. in moving from
a generalisation concerning the group to an individual member
of it, which usually cannot be done in statistical interpretation
because of the nature of statistical results. If we say that 10%
of the dwellers of a city wear Bata shoes, to conclude that in a family
of ten members one member wears Bata shoes will
be wrong.

2—Drawing unwarranted conclusions—Conclusions
are often drawn which are unwarranted by the facts. To conclude that
Indian women have become less fashionable because imports of
cosmetics have fallen may not be correct. There may be
other reasons for reduced imports. Domestic production of
cosmetics might have increased so as to leave little room for
imported cosmetics.


3—Drawing conclusions from an argument running from
effect to cause—An interpreter must be constantly on guard in
assigning cause and effect relations. It may be argued that
because the number of cinema-goers has increased, the
standard of living of the people has risen. But the increase in
number may be due to a change in the pattern of consumption.
Therefore one should be very cautious in asserting such cause
and effect relationships.


4—Unequal basis of comparison—Comparison between two
sets of data will yield correct results only if they are on an equal
basis. Most social and economic data, like income, wealth,
death rate, etc., are collected from different populations having
different characteristics. Averages or rates of one area
compared with those of another may give misleading conclusions
because the population bases are not entirely comparable.


5—Coefficient of correlation—The coefficient of correlation is the
outcome of a mathematical process. Two series which are
unconnected with each other but which have similar movements may
show a high coefficient of correlation. This does not mean that
there is any real relationship between them. The fact that two
series show a fairly high degree of correlation must not be
taken as evidence that one is related to the other. Other
factors may be responsible for their similar movements.
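
The point can be checked numerically. The sketch below computes the coefficient of correlation for two series that merely rise together over time. The two series and their names are invented for illustration only and are not taken from any actual record.

```python
from math import sqrt

def pearson_r(x, y):
    """Karl Pearson's coefficient of correlation for two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# HYPOTHETICAL figures: two unconnected series that both grow year by year.
radio_licences = [10, 12, 15, 19, 24, 30]
road_accidents = [200, 215, 240, 260, 300, 330]

r = pearson_r(radio_licences, road_accidents)
print(f"r = {r:.3f}")  # a value close to +1, although neither series causes the other
```

A high value of r here reflects only the common upward movement of the two series; it says nothing about any connection between them.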


6—Irrelevant Data—Conclusions are sometimes drawn with 
regard to a problem, from the data unconnected with the problem. 
Inferences should be made only from the data relevant to the 
purpose. 

7—Coefficient of Association—The coefficient of association may
also give fallacious conclusions. If there is positive association
between coaching and passing of an examination, it should not
be immediately concluded that coaching is essential for passing
the examination. There are other methods also of obtaining
knowledge.


8—Use of Percentages—Percentages are used to show the
change in an aggregate when the relative change is important
rather than the actual amount of change. Mistakes of
interpretation often result from the use of percentages worked out
on unequal or very small bases. If in a college
there are two students appearing at the M. Com. examination and
both of them get through the examination, the result is 100%.
In another college there are 100 students appearing at the M. Com.
examination, out of whom 80 pass, and the result is 80%.
It would be wrong to conclude that the first college is better than the
second, because the number of students is not the same.


9—Use of Index Numbers—Use of index numbers may also 
lead to wrong conclusions. Index numbers are subject to the 
limitations of averages generally. Further, complete data on 
which to base their calculations are rarely available. The 
results are largely derived from sampling and are therefore 
subject to a somewhat indefinite margin of error. This type
of error cannot be measured mathematically.


10—False Generalisations—Errors are also caused by
false generalisations and faulty use of statistical methods. For
example, it may be argued that wages were lowered by 20% and
were later raised by 20%, and hence there is no cause for
complaint. But the fact is that a labourer who was originally
getting Rs. 100 p.m. got Rs. 80 when wages were reduced,
and when wages were increased his wages became Rs. 96, so
that he lost Rs. 4 through these changes.
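
The arithmetic of this example can be traced step by step. The following is a minimal sketch; the function name is only illustrative, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def apply_percent_change(amount, percent):
    """Return the amount after a rise (positive) or cut (negative) of `percent` per cent."""
    return amount * (1 + Fraction(percent, 100))

original_wage = Fraction(100)                          # Rs. 100 per month
after_cut = apply_percent_change(original_wage, -20)   # 20% cut on Rs. 100
after_rise = apply_percent_change(after_cut, 20)       # 20% rise on the reduced wage
net_loss = original_wage - after_rise

print(after_cut, after_rise, net_loss)
```

The rise of 20% is calculated on the reduced wage of Rs. 80 and not on the original Rs. 100, which is why the two changes do not cancel.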


General Directions. The work of interpretation of
statistical data is of a complicated nature. No hard and fast
rules can be laid down. However, some general directions will
guide an interpreter. A statistician must carefully satisfy himself
regarding possible sources of errors before he sets about
the task of interpretation. The statistician must realise that
the task of interpretation requires, besides other things, sound
common sense and maturity of thought and judgement. The
available statistical data should be subjected to various statistical
operations and devices. By calculating the mean, median, mode etc.
a concise picture of the group should be obtained. These
measures also help in making comparisons. If the data form
a time series, the methods of time series analysis should be used.
Graphs, if drawn, will be very useful, as they enable the interpreter
to notice tendencies in the data which are
difficult to detect by merely scanning the tables. To study
inequality, the Lorenz curve may be made use of. It will be a serious
mistake if the data are interpreted without studying the background
as well as the general implications. It is very likely that
circumstances have changed materially in the period
under consideration.


Illustration—1 


Point out ambiguity or mistake, if any in the following 
statements :— 


(a) The death rate in the American Navy during the 
Spanish American war was 9 per thousand while in the city of 
New York for the same period it was 16 per thousand. It was 
safer then to be a sailor in the American Navy than to live in the
city of New York. 


This conclusion is wrong because the comparison has
been made between different types of populations. Conditions
of living are very much healthier in the Navy than they are in
the city of New York. Moreover the Navy consists of young and
healthy people ; therefore its mortality rate is bound to be lower
than that of New York city, where old and young, sick and
healthy alike live in surroundings which are not always conducive
to longevity. The comparison has not been made taking all the
facts into consideration.


(b) The Per Capita income for India in 1931-32 according 
to the estimates framed by Dr. V.K.R.V. Rao was Rs. 65. The 
estimate for 1948-49 framed by the National Income Committee 
was Rs. 225. In 1948-49 India was therefore four times more 
prosperous than in 1931-32. 


This conclusion is based upon a wrong interpretation of facts.
The price level in 1948-49 was many times higher than it was
in 1931-32. The year 1931-32 fell in a period of depression
when prices were touching the bottom. In 1948-49, which was a
post-war year, prices were very high. Hence, unless per capita
income is converted into real income at 1931-32 prices, it would
not be correct to conclude that India was four times more prosperous
in 1948-49 than in 1931-32.
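
The conversion of money income into real income may be sketched as below. The two income figures are those quoted in the statement; the price index for 1948-49 is an assumed, purely illustrative figure (base 1931-32 = 100) and is not an actual published index number.

```python
def real_income(money_income, price_index, base_index=100.0):
    """Deflate a money income to base-year prices using a price index."""
    if price_index <= 0:
        raise ValueError("price index must be positive")
    return money_income * base_index / price_index

income_1931_32 = 65.0    # Rs., as quoted in the statement
income_1948_49 = 225.0   # Rs., as quoted in the statement

# ASSUMPTION: illustrative index only, not an actual published figure.
assumed_price_index_1948_49 = 350.0

real_1948_49 = real_income(income_1948_49, assumed_price_index_1948_49)
money_ratio = income_1948_49 / income_1931_32
real_ratio = real_1948_49 / income_1931_32

print(f"Ratio of money incomes: {money_ratio:.2f}")
print(f"Ratio of real incomes under the assumed index: {real_ratio:.2f}")
```

Whatever the true index may be, the comparison of prosperity has to rest on the ratio of real incomes and not on the ratio of money incomes.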
(c) The examination result of school X was 75% in a
particular year. In the same year at the same examination
only 400 out of a total of 600 students were successful in school Y.
The teaching standard of the former school was decidedly better.


Unless the number of students in school X is comparable, such a
conclusion will be wrong. It is possible that there were only
four students in school X, out of whom three passed. Here a
wrong interpretation has been made due to misuse of percentages.


Questions 


1. What kind of mistakes are generally made in interpreting 
Statistical data ? Give examples. (B. Com. Alld.) 


2. Interpret the data given below and illustrate any two 
series given by a suitable diagram :— 


Country                       World    World        World        World
                              land     cultivated   production   population
                              area     area         of cereals
Asia excluding U.S.S.R.       18.6     32.9         31.0         53.1
North America                 17.8     21.2         21.5          8.2
U.S.S.R.                      16.1     16.8         22.0          7.6
Europe excluding U.S.S.R.      3.7     16.8         16.0         17.9
Mid and South America         18.2      5.7          4.5          5.0
Africa                        24.1      5.6          4.0          7.7
Oceania                        7.0      1.5          1.0          0.5
Total                        100.0    100.0        100.0        100.0


(M. A., Allahabad.)


3. How far do you agree with the conclusions drawn in the 
following cases :— 


(a) It is observed that intelligent fathers have intelligent sons,
and intelligent grand-fathers have intelligent grand-sons ; therefore
intelligence is hereditary.


(b) Two series, quantity of money in circulation and general
price index, are found to possess positive correlation of a fairly
high order. It is concluded that one is the cause and the other
the effect in a direct causal relationship.


(c) It is observed that generally death rates in two towns are 
identical. It is inferred from this that the populations of both 
the towns are equally healthy. (M. A. Rajputana.)


4. Comment on the following inferences :— 


(a) The population of Bhopal has doubled during the last 
four years. Therefore the birth rate of the town has also doubled. 


(b) The import of food-grains in India is increasing. The 
production of food-grains in India is therefore declining. 


(c) Income from excise duties in India is increasing year 
after year. Therefore production is also increasing in India. 


5. Point out the mistakes in the following inferences :— 


(a) There are 500 employees in a factory. Their daily 
earnings are about Rs. 2.00 per day. Therefore the total monthly
wage bill of the factory is Rs. 30,000. 


(b) An ordinary person in India uses one pair of shoes every
year. Therefore the total annual demand for shoes in India by
her 44 crores of people is 44 crores of pairs.


(c) A vast majority of students in a hostel spend Rs. 100 per
month. Therefore the total monthly expenditure of the 50
students of the hostel is Rs. 5000/-.


(d) A merchant usually receives 100 customers a day.
Therefore, the total number of customers received by him in a
month is 3000.


(e) Most of the patients die in the emergency ward of the
city hospital, therefore it is unsafe to be admitted to the ward.


APPENDIX 
Mathematical Tables And How To Consult Them 


For understanding statistical methods, an elementary knowledge
of a few principles of arithmetic or mathematics is
essential. Below are given a few rules regarding calculations etc.,
which will be helpful to the students of statistics.


A—Rules for determining the decimal point. 

1—In the case of multiplication the number multiplied is the
multiplicand, the number multiplied by is the multiplier, and the
result is the product.

Rule :—The number of decimal places in a product is equal
to the sum of the decimal places in the multiplier and the
multiplicand.

For example—
.9 × .81 = .729
1.2 × 6 = 7.2
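
The rule can be verified mechanically. The sketch below uses Python's standard `decimal` module, which keeps the digits exactly as written; the helper names are only illustrative.

```python
from decimal import Decimal

def decimal_places(value):
    """Number of digits written after the decimal point of a Decimal."""
    return max(0, -value.as_tuple().exponent)

def check_product_rule(multiplicand, multiplier):
    """Return the product, its decimal places, and the places predicted by the rule."""
    a = Decimal(multiplicand)
    b = Decimal(multiplier)
    product = a * b
    predicted = decimal_places(a) + decimal_places(b)
    return product, decimal_places(product), predicted

for pair in [(".9", ".81"), ("1.2", "6")]:
    product, actual, predicted = check_product_rule(*pair)
    print(pair, "->", product, "| places:", actual, "| predicted:", predicted)
```

The figures are passed as strings so that they are read exactly as printed, without binary rounding.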


2—When raising a number to a power, multiply the number 
of decimals in the number by the power to which it is raised to 
obtain the number of decimals in the result. 
For example— 
(0.7)² = .49
(0.11)² = .0121
(0.25)² = .0625
(0.25)³ = .015625


3—When dividing, the number divided is the dividend, the 
number to be divided by is the divisor, and the answer is the 
quotient. 


Rule :—The number of decimal places in a quotient is equal 
to the number of decimal places in the dividend less the number 
in the divisor. 

For example— 

8.40 divided by 2    = 4.20   (Two − 0)
8.40 divided by .2   = 42.0   (Two − 1)
8.40 divided by .02  = 420    (Two − 2)
8.40 divided by 20   = .42    (Two − 0)
8.400 divided by 200 = .042   (Three − 0)

4—When taking the square root, the number of decimals is
equal to one half of the number of digits to the right of the
decimal point in the number for which the square root is to be
found, including added ciphers.

For example—

√144   = 12         √25    = 5
√1.44  = 1.2        √.25   = .5
√.0144 = .12        √.0025 = .05
B—Rules of signs.


1—Addition :—When adding a series containing both plus and
minus quantities, add the plus and the minus quantities separately
and then take the difference between the two sub-totals, giving the
difference the sign of the numerically larger sub-total.

For example—

2 + 5 − 3 + 6 − 7 − 4 = 13 − 14 = −1
2—Subtraction :—Subtraction involves only two numbers.
The rule always to be observed in subtraction is as follows : change
the sign of the number to be subtracted and proceed as in addition.
For example—
4 − (+2) = 4 − 2 = 2
3 − (−2) = 3 + 2 = 5

3—Multiplication :—The result of any multiplication is
positive unless an odd number of negative values are being
multiplied. If an odd number of negative values are
multiplied, the result is negative.

For example—

(+3) × (+2) × (+4) = +24
(+2) × (+6) × (−1) = −12
(−4) × (−2) × (+5) = +40

4—Division :—Like signs, whether positive or negative, give
a positive result ; unlike signs give a negative result.

For example—

(+25) ÷ (+5) = +5

(+64) ÷ (−8) = −8

(−35) ÷ (−5) = +7
C—Logarithms. 

The use of logarithms makes many of the computations
in statistics easier. Sometimes results are secured through the use of
logarithms which could scarcely be attained in any other way.
In statistics logarithms are principally used in constructing ratio
graphs, in computing the geometric mean, in fitting trend lines and
in finding powers and extracting roots.

The system of logarithms is on the base 10, and the logarithm
of a number is the power to which 10 must be raised to obtain
that number. Thus 10² is 10 × 10 and equals 100, so that the
logarithm of 100 is 2. The logarithm of 1000 is 3 because
10 × 10 × 10 = 1000. It is convenient to find out the logarithms of such
numbers without consulting any table, but for other numbers logs
cannot be found out so easily.

The log of a number consists of two parts, the characteristic
and the mantissa. The number which refers to the integral
power of ten is called the ‘characteristic’ of the logarithm.
The characteristic is determined by applying the following rules.
We need not consult any table for finding out the characteristic.

1—If the natural number is greater than one, the characteristic
of the logarithm is 1 less than the number of digits to the left
of the decimal point.

2—If the natural number is less than one, the characteristic is
negative and is 1 more than the number of ciphers immediately
after the decimal point.

For example—

Number        Characteristic
58        =     1
537       =     2
5459      =     3
54596     =     4
.5        =    −1
.05       =    −2
.005      =    −3
.0005     =    −4
.00505    =    −3

Finding the Mantissa
The fractional part is known as the ‘mantissa’ of the logarithm,
and mantissa values are to be found by consulting specially
prepared tables. There are two things which should be
remembered about the mantissa :
(a) The mantissa is always positive.
(b) The mantissa is not affected by the position of the decimal
point. Thus the mantissa of 525, 52.5, 5.25, .525, .0525
will be the same.


The procedure for finding the mantissa of a given number is
that the figure is reduced to 4 digits, by approximation if necessary.
The first two digits are located in the left-hand vertical column and
we read off the figure given against them in the column headed by
the third digit. To the figure thus obtained we add the quantity
appearing under the ‘mean differences’ in the column headed by
the fourth digit.


According to this procedure the mantissa of 7367 is .8673
(i.e. the figure against 73 in column 6 is 8669, and in the mean
difference column for 7 we find the figure 4, so that 8669 + 4 = 8673).


Thus logs of


6770   = 3.8306
677    = 2.8306
67.7   = 1.8306
6.77   = 0.8306
.677   = 1̄.8306
.0677  = 2̄.8306

75     = 1.8751
999    = 2.9996
10.54  = 1.0229


(Note : when the characteristic is negative, the minus sign is placed
above the characteristic and not at its customary place, because the
mantissa is always positive.)
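
The splitting of a logarithm into characteristic and mantissa can be sketched as below. The standard `math.log10` function stands in for the printed tables, so the mantissa agrees with the four-figure table value only after rounding; the function name is illustrative.

```python
import math

def characteristic_and_mantissa(number):
    """Split log10(number) into an integral characteristic and a positive mantissa."""
    if number <= 0:
        raise ValueError("logarithms are defined only for positive numbers")
    log_value = math.log10(number)
    characteristic = math.floor(log_value)   # negative for numbers below one
    mantissa = log_value - characteristic    # always lies between 0 and 1
    return characteristic, mantissa

for n in (6770, 677, 67.7, 6.77, 0.677, 0.0677):
    c, m = characteristic_and_mantissa(n)
    print(f"{n:>8}  characteristic {c:>2}  mantissa {m:.4f}")
```

Taking the floor of the logarithm, rather than truncating it, is what keeps the mantissa positive for numbers less than one.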


Finding the Antilogarithm 
If we have the logarithm and wish to determine its natural
number, the table of anti-logarithms will have to be consulted.
A log consists of the characteristic and the mantissa. For finding out
a log, we consult the table only for the mantissa part ; the
characteristic part is determined by certain rules. Therefore when we
have to find out the antilog of a given number, we look up the mantissa
part only, i.e. the digits after the decimal point. The procedure for
consulting anti-log tables is the same as for log tables. The place of
the decimal point is determined by the characteristic part of the given
number, i.e. the digits given before the decimal point.
The antilogs of certain numbers are— 


Log          Anti-log
0.7492   =   5.613
1.7492   =   56.13
2.7492   =   561.3
3.7492   =   5613
1̄.7492   =   .5613
2̄.7492   =   .05613
3̄.7492   =   .005613


Use of Logarithms 


(a) To multiply numbers add their logarithms. To find the
product of two or more numbers, find the sum of their logarithms
and find out the antilogarithm of the sum. In adding logs, it should
be kept in mind that the mantissa is always positive, while the
characteristic may be either positive or negative. Symbolically

a × b = antilog of (log a + log b)

Example—

(i) 135 × 12.2
Log of 135  = 2.1303
Log of 12.2 = 1.0864
              ------
              3.2167
Antilog of 3.2167 = 1647

(ii) .674 × .043
Log of .674 = 1̄.8287
Log of .043 = 2̄.6335
              ------
              2̄.4622
Antilog of 2̄.4622 = .02898
(b) To divide one number by another, subtract the logarithm
of the latter from the logarithm of the former and find out the antilog
of the difference. Symbolically

a ÷ b = antilog of (log a − log b)

Example—

(i) 576 ÷ 32
Log of 576 = 2.7604
Log of 32  = 1.5051
             ------
             1.2553
Antilog of 1.2553 = 18.00

(ii) .0007 ÷ .003
Log of .0007 = 4̄.8451
Log of .003  = 3̄.4771
               ------
               1̄.3680
Antilog of 1̄.3680 = .2333


(c) To raise a number to a certain power, multiply the
logarithm of the number by the exponent of the power. Find
out the antilog of the product. Symbolically

aˣ = antilog of (log a × x)

Example—

(i) (2.54)⁶
Log of 2.54 = 0.4048 ; 0.4048 × 6 = 2.4288
Antilog of 2.4288 = 268.4

(ii) (.0991)³
Log of .0991 = 2̄.9961 ; 2̄.9961 × 3 = 4̄.9883
(In multiplying 2̄.9961 by 3, the mantissa gives 2.9883, so 2 is carried
forward to the characteristic ; it is set against the product of
3 and 2̄, i.e. 6̄, and thus the characteristic of the product is 4̄.)

Antilog of 4̄.9883 = .000973 approximately
(d) To extract any root of a given number, divide the
logarithm of the number by the index of the root. Find out the
antilog of the result. Thus

ⁿ√a = antilog of (log a ÷ n)

Example—

(i) ⁵√3125
Log of 3125 = 3.4949 ; 3.4949 ÷ 5 = .6989 or .6990
Antilog of .6990 = 5
(ii) ⁶√.00325
Log of .00325 = 3̄.5119
To divide 3̄.5119 by 6 we shall have to write it as
6̄ + 3.5119, because in 3̄.5119 the characteristic is negative and
the mantissa is positive, and the division is not possible with the
figures as they are. Hence

(6̄ + 3.5119) ÷ 6 = 1̄.5853
Antilog of 1̄.5853 = .3849
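
The four uses of logarithms described under (a) to (d) can be sketched together as below. The function names are illustrative; `math.log10` and powers of 10 take the place of the log and antilog tables, so the results carry more figures than the four-figure tables give.

```python
import math

def antilog(value):
    """Antilogarithm to base 10."""
    return 10 ** value

def multiply_by_logs(a, b):
    return antilog(math.log10(a) + math.log10(b))

def divide_by_logs(a, b):
    return antilog(math.log10(a) - math.log10(b))

def power_by_logs(a, exponent):
    return antilog(math.log10(a) * exponent)

def root_by_logs(a, index):
    return antilog(math.log10(a) / index)

print(multiply_by_logs(135, 12.2))    # compare with 135 x 12.2
print(divide_by_logs(576, 32))        # compare with 576 / 32
print(power_by_logs(2.54, 6))         # compare with (2.54) to the sixth power
print(root_by_logs(3125, 5))          # compare with the fifth root of 3125
```

All four functions require positive numbers, since the logarithm of zero or of a negative number is not defined.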
D—Reciprocals. 


Not infrequently in statistical work, reciprocals have to be
found out. The reciprocal of a given number is unity divided by
that number. Thus the reciprocal of 4 is 1/4, and so on for
other numbers.
The tables of reciprocals give the reciprocals of the numbers shown
in the left-hand vertical column. The third digit of the given
figure is to be looked up in the top horizontal row, and the fourth
digit in the columns of mean differences. The figure appearing in
the mean differences column is not to be added but deducted.
It should also be remembered that if the decimal point moves
by one digit to the right in the given number, it moves by one digit
to the left in the reciprocal. The reciprocals of certain numbers
are given below :—


Number     Reciprocal     Number     Reciprocal
5          0.2            0.6394     1.564
12         0.08333        0.0322     31.06
315        0.003175       0.0045     222.2
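
The table and the rule about the decimal point can be checked by direct division. This is only a sketch with an illustrative function name; the printed table values are rounded to four significant figures.

```python
def reciprocal(number):
    """Unity divided by the given number."""
    if number == 0:
        raise ZeroDivisionError("zero has no reciprocal")
    return 1 / number

for n in (5, 12, 315, 0.6394, 0.0322, 0.0045):
    print(f"{n:>8}  ->  {reciprocal(n):.4g}")

# Moving the decimal point one place to the right in the number
# moves it one place to the left in the reciprocal.
print(reciprocal(5), reciprocal(50), reciprocal(500))
```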


E—Extracting Square Root. 


There are three ways of extracting the square root of a
number, viz.


(a) by the use of logarithms
(b) by the use of tables
(c) by the longhand method.


(a) The use of Logarithms :—By this method the square root
of a given number is the antilog of the logarithm of that number
divided by 2.

For example, the log of 144 is 2.1584 ; divided by 2 it comes
to 1.0792, of which the antilog is 12.


(b) Use of Tables :—The square root of a number can also be found
out by reference to the Tables given at the end of this appendix.


(c) Longhand Method :— 


(1) Beginning with the unit's place, mark off the digits in
pairs. The digits after a decimal point are marked off
in pairs from the first figure after the decimal point.


(2) Find the greatest square for the first pair or if not 
pair for the single digit. The square root is 
calculated as— 

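
Steps (1) and (2) can be sketched for whole numbers as below. The later steps in the sketch (bringing down the next pair and trying each new digit against twice the root already found) follow the usual longhand method and are an assumption, as they are not set out in the steps above; the function name is illustrative.

```python
def longhand_sqrt(number):
    """Whole-number square root by the longhand (pairing) method.

    Returns (root, remainder) with root * root + remainder == number.
    """
    if number < 0:
        raise ValueError("number must not be negative")
    digits = str(number)
    if len(digits) % 2:
        digits = "0" + digits            # step (1): pair off from the unit's place
    pairs = [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]

    root = 0
    remainder = 0
    for pair in pairs:
        remainder = remainder * 100 + pair   # bring down the next pair
        digit = 0
        # step (2): the largest digit d with (20 * root + d) * d <= remainder
        while (20 * root + digit + 1) * (digit + 1) <= remainder:
            digit += 1
        remainder -= (20 * root + digit) * digit
        root = root * 10 + digit
    return root, remainder

print(longhand_sqrt(144))
print(longhand_sqrt(3125))
```

For the first pair the root found so far is zero, so the trial reduces to finding the greatest square not exceeding that pair, as stated in step (2).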
CHAPTER 1


INDIAN STATISTICS 


“The statistics of India have largely originated as a
by-product of administrative activities, such as the collection of
land revenue, or from the need of information relating to
emergencies, such as famines. Only in the case of the population
census and to some extent of foreign trade has there been an
organisation whose primary duty is the collection of information.
As a result the statistics are unco-ordinated and issued in various
forms by separate departments. The situation cries out for
overhaul under the control of a well qualified statistician.”


BOWLEY-ROBERTSON COMMITTEE, 1934.


Meaning. The term Indian Statistics implies a study of the
various sources and organisations through which statistical data
are compiled in this country. It also includes a
criticism of the present organisation, along with suggestions
for improvement. The study covers both official and
non-official organisations. The importance of statistical data has
never been more evident than it is at the present time. Since
India became independent, great and rapid strides have been made
in the field of collection of data. Indian Statistics, in the
modern sense of collection and interpretation, has developed since then.
India has chosen the path of planned economic development. In
the context of economic planning the importance of statistics in the
country has become great. Economic progress through economic
planning is not a miracle which a country can achieve overnight.
It takes time to build up a sound economy. Statistics are necessary
for framing economic plans and for judging their progress.

Growth of Indian Statistics. A modern state is concerned not only
with maintaining law and order but also with collecting
facts for formulating various economic policies and judging their
results. Certain statistics are collected automatically when the state
discharges its functions. Such statistics are the by-products of
administration. Statistics are also collected by the state on its own
initiative. In our country too, as in other ancient countries, statistical
facts were collected by the state in ancient times. Such statistics
used to be collected either to assess the military strength of the
state or to know the taxable capacity of the people. As early as
300 B.C. India had such statistics. Records like
“Kautilya’s Arthashastra” and “Ain-e-Akbari” show that the
state collected statistics in India in olden days, but the collection of
statistics as a function of the state, according to the modern
concept of the term, started with the British administration, and
that too after 1868. The annual ‘Statistical Abstract of British
India’ was published for the first time in 1868. It was published
from London, and continued to be published every year from there
till 1923. After that, its publication started in India. In the
year 1874, Sir John Strachey, the then Governor of North-Western 
Province (now called Uttar Pradesh) suggested to the Secretary of 
State for India that a department should be created for the collec- 
tion of statistical information regarding trade and agriculture. 
He also suggested that a Director of Agriculture and Commerce 
should also be appointed. It was on his suggestion that a Depart- 
ment of Agriculture and Commerce was set up in that state in the 
year 1875. One of the main functions of this department was to 
collect trade statistics and to suggest ways and means of improving
agricultural statistics. A little later, on the recommendations of the
Indian Famines Commission Agriculture Departments were opened 
in other states, and the Central Agricultural Department which 
was created in 1871 but which was closed due to financial stringency 
was also revived, in order to co-ordinate the work of the various 
provincial agricultural departments. Though these agricultural 
departments were primarily concerned with the improvement of 
agriculture, yet they collected valuable statistical material on 
various agricultural problems. In the year 1881 the first popula- 
tion census was taken. In the same year the Imperial Gazetteer 
of India was published for the first time. It contained economic 
statistics of the different parts of the country. During the last few 
years of the 19th century other departments of the Government of 
India also started collecting and publishing statistical information 
relating to their subject. In 1883, a Statistical Conference was 
held at Calcutta. The Conference recommended the institution of
all-India crop forecasts, and the conducting of a Quinquennial
Livestock Census. Thus in 1894, for the first time there were crop
forecasts for wheat production. In subsequent years forecasts were
made for other agricultural commodities also. In the year 1886,
a publication entitled ‘Returns of Agricultural Statistics of British
India’ was issued. In 1895 a Statistical Bureau was set up to
co-ordinate both agricultural statistics and foreign trade statistics.
The Bureau was headed by the Director-General of Statistics.


In the year 1905, the office of the Director General of
Commercial Intelligence was established at Calcutta. The setting
up of this organisation was a landmark in the history of the development
of statistics in India. The Director General of Commercial
Intelligence replaced the Director General of Statistics. He was
to bring about a liaison between the Government and the trading
community, in addition to the work done by the Director General of
Statistics. At the time of its creation, the department was assigned
the following functions :—


(i) to collect business statistics and to help commerce and
trade ;
(ii) to provide a meeting ground for Indian and foreign
businessmen ;
(iii) to compile and publish statistical data which
were formerly published by the Government of India relating to
items of commercial, judicial, administrative and agricultural
importance.

Before World War I this department was publishing
the following journals :—

(i) Review of Trade of India.
(ii) Statement of Foreign Sea-borne Trade and Navigation
of British India.
(iii) Statistical Abstract for British India.
(iv) Estimates of Area and Yield of Principal Crops in India.
(v) Agricultural Statistics of British India.

The Department of Commercial Intelligence brought out the
first issue of the ‘Indian Trade Journal’ in 1906. In the year 1912,
when the headquarters of the Government of India were shifted
from Calcutta to New Delhi, it was decided to separate statistics
from commercial intelligence. This move dislocated the work of
both the departments. Owing to the inconvenience faced, it was
decided in 1922 to merge the two departments again. The
designation of the head of this department was then changed to
the Director-General of Commercial Intelligence and Statistics.


The First World War brought to the fore the deficiencies in
the statistical material then available in the country. The country
got an impetus for industrial development during the War period.
Consequently, the question of collection of industrial statistics came 
to the forefront. The Industrial Commission, which was 
appointed by the Government of India in 1916, made pertinent 
observations regarding the responsibility of the Government for 
collection, compilation, careful analysis and judicious distribution of 
commercial statistics and intelligence, both in peace and war. The 
recommendations of the Commission were very valuable so far as 
the development of Indian Industrial Statistics was concerned. 
The appointment of a Director of Commercial and Industrial 
Intelligence was also recommended by the Commission. This 
officer was to collect, compile and present the statistics relating to 
foreign trade tariffs, industrial production and other trade and 
industrial statistics. No action was taken on these recommendations 
by the Government of India. 

In 1925, the Indian Economic Enquiry Committee was
appointed under the chairmanship of Sir M. Visvesvaraya to
“examine the material at present available for framing an estimate
of economic condition of the various classes of people of British
India ; to report on its adequacy ; and to make recommendations
as to the best manner in which it may be supplemented and as to
the lines on which a general economic survey should be carried
out with an estimate of the expenditure involved in giving effect to
such recommendations.” The Committee made a thorough study
of the problems referred to it, and made the following valuable
recommendations :

(i) In large industries, a quinquennial wage census be
undertaken.

(ii) Statistics be collected regarding the total quantity and value
of products of cottage industries and of raw materials
consumed.

(iii) Data be collected regarding the total number of
workers employed.

(iv) A Central Statistical Bureau with a view to provide a 
common purpose and central thinking office for 
statistics be established. 

(v) Provincial statistical bureaus should be set up. 

(vi) The various statistical organisations be legalised to 
facilitate their work. 


The recommendations of the Committee were only partially
accepted by the Government. Most of the crucial recommendations
were ignored.
In 1928, the Royal Commission on Agriculture suggested the
creation of an Imperial Council of Agricultural Research to
promote, guide and co-ordinate agricultural research. Though the
Government of India did not accept the recommendations of the
Commission in toto, later on, by a resolution passed on the
4th August, 1930, the Council of Agricultural Research was constituted
as a department of the Government of India.

Another important event in the development of statistics in 
India, was the appointment of Bowley-Robertson Committee in 
1934. Bowley and Robertson, the two noted economists of the 
U.K. were invited by the Government of India to :— 

(i) facilitate the further study of economic problems of 
India ; 

(ii) give views on existing statistical information and
organisation with special reference to the gaps ;

(iii) suggest means for filling them ;

(iv) recommend about the organisation of a Central 
Statistical Department to collect and co-ordinate 
statistical enquiry for whole of India ; 

(v) discuss the раону and scope of a census of 
production ; 

(vi) give a critical view of the material available for
measuring national income ; and

(vii) give recommendations on the construction of Index
Numbers of prices, wages and production.

The Committee recommended the appointment of a permanent 
staff with a Director of Statistics at the centre. The other recom- 
mendations of the Committee were concerned with problems of 
organisation of statistics, rural and urban surveys, census of produc- 
tion, and measurement of national income. These recommenda- 
tions could not be implemented then by the Government. In 1938 
an Office of the Economic Adviser to the Government of India 
was created, for collection and analysis of economic statistics. 


With the out-break of the Second World War need was felt 
for collecting statistics about a large number of problems. As a 
result of such a need small statistical organisations were set up in 
a number of central and provincial departments. In 1942 
Industrial Statistics Act was passed. The Department of 
Industrial Statistics conducted the first Census of Manufactures in 
1946. The Labour Bureau started constructing cost of living index
numbers for certain urban and rural areas with the base year 1939.
The Economic Adviser's office also started publishing the General
Purpose Wholesale Price Index Number. In the year 1949, the
Government of India appointed a National Income Committee, and
since then regular estimates of national income have been published.
Since 1950, a scheme of National Sample Surveys has been in
operation, under which reliable statistics on various economic problems
have been collected. In the year 1951 an International Statistical 
Conference was held at Calcutta. The Conference studied statis- 
tical problems which were common to all countries and suggested 
improvements with a view to bring about conceptual uniformity in 
the data collected. In 1953, the Government passed Collection 
of Statistics Act. The Act empowered the Government to collect 
all types of statistics relating to any matter. 


After independence, besides Governmental organisations, the
Indian Statistical Institute, Calcutta, National Research Institutes,
Research Councils and Universities are all contributing to the
collection of statistics and adding impetus to statistical research.
Besides this, noted Indian statisticians like Prof. P. C.
Mahalanobis, Prof. P. V. Sukhatme and Prof. C. R. Rao have contributed
a great deal to the theory of Statistics.

The study of Indian statistics will be made under the following 
heads :—

I—Statistical Organisations of India.

II—Indian Statistical Material. This can be studied under
the following sections :—

A—Agricultural Statistics,
B—National Income, 
C—Population Statistics, 
D—National Sample Survey, 
E—Price Statistics, 
F—Industrial Statistics, 
G—Trade Statistics, 
H—Financial Statistics, 
I—Labour Statistics. 
III—General Criticism of Indian Statistics. 




CHAPTER 2


STATISTICAL ORGANISATIONS OF INDIA 


The present statistical organisation of our country has evolved
in order to meet the fast increasing requirements of statistical data
for a variety of purposes. As the country has adopted the path of
planned economic development, greater importance is being
attached to statistics. This has given impetus to the collection of
statistical data. In recent years there has been a rapid growth
of the agencies for collecting statistics. The national Government
has paid due attention to the development of statistical data in
the country. The emergence of international organisations like
the U.N.O., I.L.O., F.A.O. etc. has also given stimulus to the
collection of statistical data.

India is a federal country. In the Constitution there are
three lists, viz. the Union List, the State List and the Concurrent List.
Statistics is included in the Concurrent List. Thus, both the
Central Government and the State Governments can collect
statistics. The statistical organisations working in India may be
divided into the following categories :—

(i) Organisations specially set up for collection and compilation
of data—There are certain statistical organisations which
have been set up primarily for the collection and analysis of
statistics of different types. Such organisations are by far the
most important units of the statistical system of the country. They
include the Department of Commercial Intelligence and
Statistics, the Directorate of Industrial Statistics, the Labour Bureau,
the Directorate of Economics and Statistics of the Ministry of Food
and Agriculture, and the Office of the Economic Adviser in the Ministry
of Industry and Commerce. There are also such organisations
in the states. Besides them, a few organisations have also been
set up for filling the gaps in the statistical data available in the
country. Such organisations are the National Sample Survey and the Central
Statistical Organisation.

(ii) Organisations processing data coming as a by-product
of Administration—There are certain organisational units of the
Governmental machinery which collect facts while discharging
their administrative functions. The Central Board of Revenue, the Post
and Telegraph Department, Railways, Roadways etc. are such
organisations. Collection of statistics is not their primary function,
but the statistics collected by them in the process of administration
are of great value in formulating policies for further action.


(iii) Organisations associated with control agencies—
Another set of organisations, falling under this category, is
associated with the control of production and distribution of
commodities in short supply. Examples of such organisations are
the Office of the Textile Commissioner, the Iron and Steel Controller,
the Central Electricity Commission etc.


(iv) Research Organisations—A number of Government 
organisations are concerned mainly with the research work in 
different fields. Research Department of the Reserve Bank of 
India, Statistical Division of the Indian Council of Agricultural 
Research are examples of such organisations. These organisations 
collect data and draw inferences from them and advise the 
Government for proper formulation of policies.

(v) Non-Governmental Organisations—A number of non-Governmental
organisations like the Indian Statistical Institute, the
Research and Training Department of the Indian Merchants'
Chamber, the Economics and Statistics Department of Tata, etc.
carry on research on various problems.

(vi) Statistical Organisations in the States—In all the states
there are Directorates of Economics and Statistics.

Having considered the broad classification, we now discuss 
the statistical organisations Ministry-wise. 


MINISTRY OF FOOD AND AGRICULTURE 


The Ministry has the following statistical units attached 
to it:— 


(i) Directorate of Economics and Statistics 


This Directorate was set up in 1947 to collect all
agricultural statistics. Formerly a number of departments were
doing this work and there was duplication and lack of co-ordination.
Since 1948, the Directorate has been the sole organisation
for compilation, analysis, interpretation and publication of all data
pertaining to the agricultural sector. It also advises the Ministry on
agro-economic matters. It publishes a large number of bulletins
dealing with agricultural statistics. The following are the regular
bulletins issued by the Directorate :—
1—Weekly Bulletin of Agricultural Prices.
2—Wholesale Prices of Foodgrains. (Weekly)
3—Agricultural Situation in India. (Monthly)
4—Agricultural Statistics of India. Vol. I and II. (Annual)
5—Abstract of Agricultural Statistics. (Annual)
6—Estimates of Area and Production of Principal Crops in
India. Vol. I and II. (Annual)
7—Indian Cotton Pressing Factories Returns. (Annual)
8—Bulletin on Various Crops. (Annual)
9—Indian Forest Statistics. (Annual)
10—Indian Land Revenue Statistics. (Annual)
11—Agricultural Wages in India. (Annual)
12—Agricultural Prices in India. (Annual)
13—Indian Live Stock Statistics. (Annual)
14—Bulletin on Food Statistics. (Annual)
15—Cotton in India. (Annual)
16—Indian Livestock Census. (Five Yearly)
17—Average Yield per Acre of Principal Crops in India.
(Five Yearly)
18—Indian Agricultural Statistics. (Ten Yearly)
Besides these publications, the Directorate publishes a number 
of ad hoc bulletins. 
(ii) Directorate of Marketing and Inspection 
This Directorate collects data and publishes reports on the
marketing of various agricultural commodities like wheat, gram, 
rice, barley and livestock and fishery products like fish, milk and 
milk products. The reports of this Directorate are not issued on 
a regular or periodical basis. 


(iii) Statistical Wing of the Indian Council of Agricultural
Research
The Council has a team of expert statisticians in its statistical
unit. The functions of the unit are :—
(a) advise on the planning of agricultural and animal 
husbandry experiments ; 
(b) scrutinise statistical programmes and progress reports 
of the research schemes of the Council and papers 
received for publication in the Council Journal ; 
(c) impart training in agricultural and animal husbandry
problems ; 

(d) carry out fundamental research on the application of 
statistical methods to agriculture and animal husbandry 
problems ; and 

(e) carry out sample surveys for the improvement of 
agricultural, livestock and fisheries statistics. 


This wing of the ICAR has made a substantial contribution 
in the field of agricultural statistics. 


(iv) Other Units 


Besides the above mentioned units, the Ministry of Food 
and Agriculture has a number of other units. They are:— 


1—Rice Research Institute, Cuttack. 

2—Forest Research Institute, Dehra Dun. 

3—Statistical unit of the Central Marine Fisheries Research
Station, Mandapam.

4—Statistical unit of Sugar and Vanaspati. 

5—Central Tractor Organisation. 

All these statistical units collect and process data relating
to their respective fields and issue statistical publications periodically
as well as on an ad hoc basis.


MINISTRY OF COMMERCE AND INDUSTRY 


The Ministry of Commerce and Industry has the following
statistical units :—


(i) Department of Commercial Intelligence and Statistics


This is the oldest statistical organisation in India, having been
set up in 1905. In earlier days this department was responsible
for the collection and analysis of most of the economic statistics,
but many of its functions have now been transferred to other
newly created units of the respective Ministries. It is now
primarily concerned with the compilation and publication of India's
internal and foreign trade statistics. The department issues the
following bulletins :—

1—Indian Trade Journal (Weekly) 


2—Monthly Statistics of the Foreign Trade of India. Vol. I
and II. (Monthly)
3—Accounts relating to the Foreign (Sea, Air and Land)
Trade and Navigation of India. (Monthly)


4—Accounts relating to Inland (Rail and River borne) 
Trade of India. (Monthly) 


Besides these publications, it also brings out more than a 
dozen publications dealing with internal and foreign trade of India. 


(ii) Office of the Economic Adviser to the Government of India 


This office was created in the year 1938. It compiles and
publishes data relating to wholesale and retail prices in the country.
It publishes a series of indices relating to wholesale prices.
The official journal of this unit is ‘Index Numbers of Wholesale
Prices in India’, which is published weekly.

(iii) Directorate of Industrial Statistics 

This unit was set up in 1946 for collecting industrial statistics,
especially to conduct the Census of Manufactures. The Directorate was
transferred to the Central Statistical Organisation with effect
from 15 July, 1957. At present the Directorate is stationed at
Calcutta, is in the charge of a Joint Director of the C.S.O.,
and has two Deputy Directors and an Assistant Director. The
Census of Manufactures has now been replaced by an Annual
Survey of Industries.


(iv) Department of Company Law Administration 
This department collects statistics relating to companies. The 
following bulletins are issued by the department :—
1—Blue Book of Joint Stock Companies in India. (Monthly) 
2—Joint Stock Companies in India. (Annual) 


(v) Other Units 

Besides the above mentioned units the Ministry of Commerce 
and Industry has, under it, the Statistical Section of the Textile 
Commissioner’s office, Statistical section of the Iron and Steel 
Controller, Statistical Division of the Office of the Chief Controller 
of Imports and Exports, Directorate of Commercial Publicity and 
Statistical Section for small-scale industries. 


MINISTRY OF FINANCE 


The Ministry of Finance had a number of statistical units 
under its control and a few of them were transferred to other 
Ministries later on. The National Income Unit was formerly
under this Ministry but now it is under the control of the C.S.O.
under the Cabinet Secretariat. Similarly, the Directorate of National
Sample Survey was originally under this Ministry, but later on it
was transferred to the Cabinet Secretariat. At present the
Ministry of Finance has the following statistical units under its 
control :— 

(i) Department of Research and Statistics, Reserve Bank of India


This is one of the most important statistical organisations of
the country. A large variety of financial statistics is published 
from this organisation. This department has four divisions viz. 
Division of Monetary Research, Balance of Payments Division, 
Statistical Division and Agricultural Credit Division. The 
department publishes a number of bulletins. Important among 
them are :— 

1—Reserve Bank of India Bulletin. (Monthly) 

2—Report on Currency and Finance. (Annual) 

3—Statistical Tables Relating to Banks in India. (Annual)

4—Report on the Trend and Progress of Banking in India.
(Annual)
5—Review of Co-operative Movement in India. (Annual)


(ii) Statistical Branch (Income Tax), The Central Board of 
Revenue 

This Branch compiles income-tax revenue statistics and
publishes ‘All India Income-tax Reports and Returns’.

(iii) Statistics and Intelligence Branch (Customs and Central
Excise), Central Board of Revenue


This branch collects and compiles statistics relating to central 
excise and customs and publishes a monthly bulletin for official 
use. 


(iv) Budget Division and Economic Adviser's Office

These organisations also publish statistics. The Comptroller
and Auditor General of India also brings out every year the
Combined Finance and Revenue Account.


MINISTRY OF LABOUR AND EMPLOYMENT 


The Ministry of Labour has the following statistical units : 
(i) Labour Bureau (Simla)

This organisation was set up in 1946 and its main functions 
are :— 
(a) Collection and publication of data relating to labour
and appraisal of statistical methods for adopting 
uniform and scientific techniques. 


(b) Maintenance of cost of living index numbers for 
selected cities. 


(c) Keeping up to date the factual data relating to working
conditions collected by the Labour Investigation
Committee.


This Bureau has conducted several ad hoc family budget 
enquiries. It compiles and publishes cost of living index numbers 
for a number of rural and urban centres. Its main publications
are :— 


1—Indian Labour Gazette. (Monthly) 
2—Indian Labour Year Book. (Annual)
3—Large Industrial Establishments in India. (Annual)
4—Statistics of Factories. (Annual)
5—Reports on the working of :
Indian Trade Union Act, 
Workmen’s Compensation Act, 
Minimum Wages Act, 
Employees State Insurance Corporation. 


Besides these regular publications, the Bureau publishes a 
number of ad hoc survey reports. 


(ii) Statistical Unit in the Department of Mines 


The unit collects data relating to labour employment in mines, 
wages, working hours etc. and publishes them in the Annual 
Report of the Chief Inspector of Mines, Indian Coal Statistics 
and monthly Coal Bulletin. 


(iii) Statistical Unit, Ministry of Labour (Agricultural Labour
Enquiry Branch)


This organisation was created temporarily in 1950-51 and
was entrusted with the work of collecting data for the All India
Agricultural Labour Enquiry Committee. The Report of the
Committee was published in 1955-56. Before the unit could be wound
up, the second Agricultural Labour Enquiry Committee was
appointed. Its report too has been published. This unit is now
working more or less on a semi-permanent basis.




(iv) Statistical Section of the Directorate of Resettlement and
Employment


It collects data relating to employment exchanges, labour 
training and employment in Government establishments. This 
information is published in the ‘Handbook’ on training facilities 
available in the country. 


OTHER MINISTRIES 
(A) Ministry of Home Affairs 


This Ministry has under it the ‘Office of the Registrar 
General and Census Commissioner’. This is now a permanent 
office. It conducts the decennial population census, and publishes 
census reports. It is also entrusted with the work of collecting 
vital statistics.

(B) Ministry of Health


The Ministry of Health has a statistical bureau which
compiles data relating to various problems associated with health 
and disease in India. Such data is published in the ‘Health 
Atlas of India’. 


(C) Ministry of Railways 


The Railway Board and the Office of the Economic Adviser
to the Ministry of Railways publish a wide variety of data
dealing with railways. The Monthly Railway Statistics and the Annual
Report of the Railway Board contain a lot of statistical
material.


(D) Ministry of Transport 


The statistical branch in the Roads Organisation compiles 
data relating to different aspects of the road development. These 
statistics are published in the ‘Basic Road Statistics in India’. 


(E) Ministry of Natural Resources and Scientific Research


The Indian Bureau of Mines collects and publishes most of the
mineral statistics of the country.


Other Ministries also have statistical units attached to them 
for collecting and compiling statistics relating to their respective 
subjects.


CENTRAL STATISTICAL ORGANISATION (C.S.O.)


During the last two decades, and particularly after
independence, a number of statistical organisations were set up.
This made the statistical machinery of the Government of India
diversified. Besides this, the activities of the various statistical Directorates
in the States had also to be brought together. Thus there
was a need for a co-ordinating agency. This need was fulfilled
with the establishment of the Central Statistical Organisation. The
CSO was set up on 2nd May, 1951 as an attached office of the
Cabinet Secretariat for the purpose of co-ordination of statistical
activities of the various ministries of the Government of India
and State Governments and for promotion of statistical standards.
The activities of this Organisation have increased very much in
recent years. It has one Director, three Joint Directors,
five Deputy Directors, nine Assistant Directors and two Officers
on Special Duty, besides a large army of statisticians, computers,
compilers, investigators etc.


Functions 
The functions of the CSO consist of the following* :— 


1—To prepare and publish regular and ad hoc publications, 
such as the Annual Abstract of Statistics, Monthly Abstract 
of Statistics, the Weekly Supplement to Monthly Abstract 
of Statistics, Estimates of National Income, Sample 
Surveys of current interest in India, etc. 


2—To serve as a channel of communication with the U.N.
Statistical Organisation, both with regard to observance
of international conventions relating to economic statistics
and provision of data required for the regular publications
and for various ad hoc purposes.

3—To represent graphically current statistics with a view to
throwing light on the developing economic situation.

4—To advise the Ministries and other Government agencies
on statistical matters and to arrange inter-departmental
discussions on statistical matters.

5—To co-ordinate the statistical work of the Ministries and
other Government agencies with a view to eliminating
and preventing unnecessary duplication and reducing the
overall cost to a minimum.


* Statistical System in India, C.S.O.


6—To develop definitions and standards for improving
national and international comparability and to give
continuing attention to the improvement of the quality
of information required by Government.


7—To keep in continuous touch with national organisations
in other countries of the world in the context of the latest
developments in methodology as well as organisation.


8—To undertake statistical work relating to planning.


9—To estimate Annual National Income and conduct
research in national income.


10—To organise and conduct training courses in official
statistics.


11—To organise meetings or conferences of Central and
State statisticians, the Standing Committee of Departmental
Statisticians, working parties etc.


12—To assess the present position with regard to population
studies and demographic research.


13—To conduct a middle class family living survey utilizing the
N.S.S. field survey.


14—To undertake special work as and when it arises, whether 
it be at the request of the Central Government Ministries 
or of the State Governments. 


The purpose behind setting up the Central Statistical
Organisation was to co-ordinate the statistics collected by various
statistical units in the centre and the states. Besides this co-ordinating
function, the CSO performs a number of other functions
including collection, processing and publication of data. The
co-ordination work regarding data collected by the states is done
through State Statistical Bureaux, which have now been established
in every state. The Technical Working Party of the CSO assists
in the examination of the statistical programmes of the states. Thus
all State Statistical Bureaux are responsible for collecting statistical
material relating to planning and development. On the advice of
the CSO, State Governments have appointed District Statistical
Officers in every district so that statistics may be collected on a
uniform pattern as laid down by the CSO. The CSO also helps,
through its Technical Working Party, in the formulation of various
schemes of the States for the development of statistics during the
five year plans. The CSO sends a list of subjects on which the statistical
departments of the States are required to collect data. The
CSO also offers advice to the State Statistical Departments in
regard to various technical problems like the technique of estimation
of State income, building consumer price index numbers etc.
The statistics collected by different Ministries are also co-ordinated
by the CSO. Recently the Government of India decided that the
functions of setting the items of work to be done by the Directorate
of National Sample Survey in the field wing, as also the details of
designs and other things like tabulation etc., should be undertaken
by the CSO. Thus this organisation now has complete control
over the work which the NSS can do and also the manner in
which it is to be done. Now all schedules and draft designs of the
various rounds of survey conducted by the NSS are examined by
the CSO, which also approves and comments on the draft reports
prepared by the NSS.


Other Functions Performed By CSO. Some of the other 
functions performed by the CSO are as follows :— 


(i) Conducts studies relating to planning—The CSO provides
economic intelligence to the Planning Commission and the
Government of India in statistical matters. The CSO regularly
conducts studies of problems concerned with national planning.
Such studies relate to problems like demands for food-grains,
sugar and cotton textiles during various plans, concentration of
trade, the size and allocation of investments, fixation
of physical targets for the plan etc. The CSO also undertakes
techno-economic studies relating to the planning projects. A
number of studies have been made about the Bhakra Nangal Projects
and the Damodar Valley Corporation Projects. The Organisation also
brings out a progress report on selected projects every month.

(ii) Conducts training programmes—The CSO has made
arrangements for the training of statisticians. It provides courses
for Senior and Junior Statistical Officers. Part time evening
courses are also conducted for researchers and other persons engaged
in statistical jobs.

(iii) Estimation of National Income—The CSO publishes
every year a white paper on national income. The work of
estimation of national income is done by the National Income Unit
of the CSO. The CSO has also been carrying on exploratory
work for the preparation of estimates of capital formation in the
country, the statistics of which are utterly lacking.
The CSO also helps the State Statistical Bureaux in the preparation
of state income estimates. The CSO is trying to improve the
national income statistics according to the suggestions made by
the National Income Committee.


(iv) Publication of Industrial Statistics—The Industrial
Statistics Wing of the CSO, which is stationed at Calcutta, is
responsible for the Annual Survey of Industries, and also conducts
special studies relating to labour cost, technological coefficients etc.
The wing also looks after Cargo and Shipping Statistics at the
request of the Transport Ministry. The Directorate of National
Sample Survey, which works under the CSO, also collects
industrial statistics.

(v) Publication of Statistical Information—The CSO publi- 
shes a large number of regular and ad hoc reports and bulletins. 
The important among them are :— 

Regular Publications :— 

1—Annual Statistical Abstract.

2—Annual Survey of Industries. (till 1958 Census of Indian
Manufactures)
3—Monthly Statistics of the Production of Selected Industries
in India.
4—Monthly Abstract of Statistics. (Now it also includes the
quarterly Review of Economic Trends in India)
5—Weekly Supplement to Monthly Abstract of Statistics.
Ad Hoc Publications :— 


1—Statistical Handbook of Indian Union. (1958)

2—Basic Statistics of Indian Economy. (1958-59)

3—Selected Plan Statistics. (1959)

4—Statistical System in India. (1958)

5—Sample Survey of Current Interest. (1958-59)

6—Handbook of Statistics According to Reorganised States.

7—Report on the Census of Central Government Employees.
(1956)

8—Report on the Annual Conference of Central and State
Statisticians.

The CSO prepares a number of charts at the request of the
Central Ministries, and some periodicals issued by Ministries also
take help from this Organisation.


(vi) Supplies statistical information to the International
Agencies—Supply of statistical information to the United Nations
Statistical Office and other International Agencies and non-official
agencies in India and abroad is one of the main functions of the
CSO. It provides statistical information under the following
heads :—
1—Monthly data for United Nations Monthly Bulletin of
Statistics.
2—Monthly data for ECAFE Quarterly Bulletin.
3—Quarterly data for United Nations Bulletin of Commodity
Trade Statistics.
4—Annual data for United Nations Statistical Year Book.
5—Annual data for United Nations Demographic Year Book.
6—Annual data for ECAFE Economic Survey of Asia and
Middle East.
7—Quarterly data for Economic Intelligence Unit of the
London Economist.
8—Information about Sample Surveys conducted in India
every year to the United Nations.
9—Ad hoc information for (a) U.S.S.R. Academy of
Sciences and (b) Geographical Division of Encyclopaedia,
U.S.A.
The CSO also sends explanatory notes on the scope,
methodology, coverage and quality of Indian statistics to the
United Nations Statistical Office and the Economic Commission for
Asia and the Far East.
(vii) Arranging Statistical Conferences and Committees—
The CSO arranges for statistical conferences and committees in
India. It also sends its delegates, representatives and sometimes
observers to statistical conferences and committees held outside
India by international agencies.
(viii) Exhibition Hall—The CSO maintains an Exhibition
Hall. The exhibits displayed in this Hall include some 500 charts,
numerous graphs and a pictorial presentation of various aspects
of the Indian Plans.
(ix) Advisory Services to the Governments—The CSO
renders miscellaneous services entrusted to it either by the Central
Government or State Governments, or on its own initiative,
for the improvement in the quality or quantity of collected data.
The CSO also offers its services for the scrutiny and examination
of all statistical schemes submitted to it for the purpose. State
statistical material is also examined by the CSO, with a view to
improving its quality.
A Critical Appraisal of the CSO's Functions


As regards co-ordination activities, the CSO has by now
examined a number of statistical schemes from almost all the
State Governments, and has made specific suggestions for their
improvement. On its recommendations, the State Governments
have strengthened their statistical machinery so as to cover all
statistical needs of the five year plans. The CSO also recommended
the establishment of district statistical agencies in the states and a
permanent field agency for conducting socio-economic surveys
and enquiries. The Organisation suggested detailed measures for
the improvement of agricultural statistics in the states and also an
institutional set-up for the training of statistical personnel. Thus,
we see that the CSO has done a lot to improve the statistical
material in the country. There has been qualitative and
quantitative improvement in Indian statistics. The Central
Government has constituted an Indian Statistical Service on its
advice.

On the other hand, it is also felt in certain quarters that the
CSO has centralised too much power, and that too much centralisation
may prove dangerous to the economy of the country. The CSO
has encroached upon the rights of various Ministries. Similarly
the NSS has also encroached upon the primary functions of State
Governments in matters relating to collection of statistics. But
at a time when reliable statistics are needed and when statistical
agencies are not up to the mark, the centralisation of powers is
not bad. Looking to what the CSO has done, we conclude that
the organisation has done valuable work in the field of
Indian Statistics.


STATISTICAL ORGANISATIONS IN THE STATES


Now all the states in India have one principal agency for
collection, compilation and co-ordination of statistical data in the
state. The same agency provides information to, and receives
instructions from, the CSO in matters pertaining to statistical
improvement. The statistical organisations of the different states
are as follows :—


1—U.P.—Department of Economics and Statistics. 
2—West Bengal—State Statistical Bureau. 
3—Bombay—Bureau of Economics and Statistics. 
4—Assam—Department of Economics and Statistics. 
5—Punjab—Department of Economics and Statistics.

6—Bihar—Directorate of Economics and Statistics.

7—Jammu & Kashmir—Planning and Statistics Department.

8—Rajasthan—Directorate of Economics and Statistics.

9—Madhya Pradesh—Directorate of Economics and
Statistics.

10—Kerala—Directorate of Economics and Statistics. 

11—Mysore—Department of Statistics. 

12—Orissa—Bureau of Statistics and Economics. 


There are such organisations in the centrally administered
areas also. Besides these central statistical organisations in the
States, different departments like Agriculture, Industries, Planning,
Labour and Social Welfare, Education, Animal Husbandry, Land
Revenue etc. also have statistical units attached to them.


PRIVATE STATISTICAL ORGANISATIONS 


The following private statistical organisations are at present
working in the country :—

1—Indian Statistical Institute, Calcutta.

2—National Council of Applied Economic Research, New
Delhi.

3—Institute of Economic Growth.

4—Gokhale Institute of Politics and Economics, Poona.

5—Tata Institute of Social Sciences, Bombay.

6—Universities of the country. 


AN APPRAISAL OF STATISTICAL SET UP IN INDIA 


The statistical set-up of the country, in spite of much effort,
has not reached a stage of perfection. In fact, much remains
to be done. But it is gratifying to note that rapid strides
are being made in the right direction. The statistical set-up of
the country has the following drawbacks :—

No General Statistical Act—There are Acts in certain fields,
such as the Industrial Statistics Act of 1942 and the Census Act, 1948. The
Industrial Statistics Act was replaced by the Collection of Statistics
Act, 1953. Thus, there are Statistical Acts in some fields, but
there is no general statistics act in the country empowering the
Government at the Centre or in the States to call for statistics
in general. Statistics, therefore, still remain a by-product of
administration.
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Centralisation—It is felt that there is too much centralisation 
in the statistical set-up of the country. But in the initial stages 
the statistical organisations can only develop properly і а lot of 
initiative is taken from above. However, gradually, proper 
decentralisation may take place. 

Training and Research—In spite of great headway, it
has to be admitted that there is even now an acute paucity of
trained personnel in statistics. This adversely affects the quality
of statistics.

Lack of Private Organisations—For the development of
unbiased statistics it is necessary that private organisations be
established. There are, so far, only half a dozen such organisations.


CHAPTER 3


AGRICULTURAL STATISTICS 


India is primarily an agricultural country. More than half
of our national income is derived from agriculture. Thus
adequate and reliable statistics relating to the agricultural sector are
of great use to the Government as well as to others. All statistics
having a bearing on the agricultural economy may be called
agricultural statistics. Collection of agricultural statistics has been
in vogue in India from very early times. In olden days, Kings
and Shahs collected such statistics and maintained a complete
record of land acreage and production, so as to determine land
revenue, which was the principal source of revenue to the state.
Kautilya's Arthashastra and the Ain-i-Akbari indicate the manner in
which agricultural statistics were collected in those days. The
British rule, when established in India, gave due importance to
the collection of agricultural statistics, as these were necessary
for the determination and collection of land revenue. The
recurring famines and droughts in the latter half of the 19th
century brought home the imperative need for collecting
agricultural statistics on a scientific basis. In 1871, on the
recommendation of Lord Mayo, a new Department of Agriculture
was created at the Centre. But it was closed later on due to the
Afghan War. In 1879, on the recommendation of the Famine
Commission, the Central Department of Agriculture was revived,
and agriculture departments were created in the provinces, led by
N.W.P. (now U.P.). A Statistical Conference was held at
Calcutta in 1883 and it recommended the institution of crop
forecasting for the important crops in India. In 1894, for the
first time, there were crop forecasts for wheat and rice. The Director
General of Commercial Intelligence and Statistics, since the
inception of the office in 1905, collected statistics regarding
agricultural matters also. In 1948, all work relating to agricultural
statistics was centralised with the Directorate of Economics and
Statistics, Ministry of Food and Agriculture, New Delhi.

The main object of agricultural statistics in India is
administrative. They are collected primarily for revenue purposes.
Agricultural statistics are also essential for the orderly functioning
of business and trade. In the present circumstances agricultural
statistics are also required for assessing the food position of the
country. The primary agricultural statistics may be studied under
the following heads :—


Area Statistics 


For a systematic and proper study of area statistics of India,
the entire country may be divided into two broad categories, viz.
(1) temporarily settled areas of U.P., Punjab and Madras, where
land revenue has been temporarily settled and is subject to change
at the time of the next settlement, and (2) permanently settled areas
of Bihar, Bengal, Orissa and the eastern part of U.P., where land
revenue has been permanently settled and is fixed for all time
to come.


I—Temporarily Settled Areas—The system of temporary
settlement was introduced in the year 1892. In temporarily
settled areas all the villages have been surveyed and mapped in
detail. The village accountant, called ‘Patwari’ or ‘Lekhpal’
(also called Karnam in the south, Karamchari in Bihar, Telathi
in Bombay), is in charge of every village or group of small
villages. He is required to make a field to field inspection of the
villages under his jurisdiction, to prepare statements showing
areas under different crops and to submit them to the revenue
authorities. This method of collecting area statistics through the
Patwari is known as the method of complete enumeration.
The Patwari keeps records of individual fields concerning crop sown,
ownership etc. in a register known as the Khasra. The total acreage
under each crop is correctly known with the help of field
inspection by the Patwari.


II—Permanently Settled Areas—In permanently settled areas
the land revenue is fixed for ever and is not subject to change;
hence the State is not very much interested in the collection of facts.
In these areas, the Chowkidar or the Village Headman is
entrusted with the task of collecting statistical facts. He also
maintains certain statistical records, but he is untrained in this
respect. Most of the statistical information of these areas is
inaccurate, as there is no supervisory staff to check and verify
the entries of the village headman. The senior officials of the revenue
department do not care to scrutinise the figures carefully.
Recently there has been a slight change in the existing practice
of collecting agricultural statistics in these areas also, but the right
approach to improve the situation in these areas would be to
appoint an agency on the same lines as the one existing in the
temporarily settled areas.

Sources of error in area Statistics. The following are some 
of the weaknesses of area statistics :— 

(1) The inadequate equipment and inefficiency of Lekhpals—
The Lekhpal is the pivot around which the revenue administration
revolves. But his work, so far as collection of statistics is
concerned, is defective and far from satisfactory. Many times,
instead of an actual field survey being made, information is sent
while sitting at home. Checking of his work by senior officers is
done in a routine manner. Low remuneration, lack of proper knowledge
and training, and overburdening with work on account of various
other duties lead the Lekhpal to do this work inefficiently
and incorrectly.

(2) Mixed crops—Wrong returns of area under different
crops arise also on account of the practice of sowing mixed
crops, which makes it difficult to estimate the area
under each of the several crops. Recently the crop cutting survey
method has been adopted in all the states to estimate the area
under mixed crops.

(3) Uncultivated patches—The uncultivated patches within
a sown area introduce some error in area estimation. Though
the Lekhpal makes an estimate of them, his estimates are simply
guess estimates.

(4) Non-survey of certain areas—There are certain areas 
which have not been surveyed, mapped and numbered. 
Gradually this drawback is disappearing. 

(5) Lack of Uniformity—There is lack of uniformity in the
definitions used in different parts of the country in the system of
reporting. It is very often difficult to say whether acreage under
crops means the area actually sown or the area successfully cropped.
When a crop fails it should be excluded from the figures of area under
cultivation.

(6) Inclusion of ridges—There is considerable over-estimation
of area under cultivation due to inclusion of ridges and bunds
in the measurement, though they are neither sown nor cropped.
A considerable area is covered by ridges in India due to
fragmentation and sub-division of land holdings.

(7) Wrong practices—Area statistics are also unreliable
because the estimate is often made according to the quantity of seed
sown. The Lekhpal, too, supplies wrong information, either out of
fear of procurement of food-grains or to support a diversion of
acreage from food crops to non-food crops. His personal bias plays
a great part in supplying information.


Yield Statistics 


In modern times, when trade in agricultural products
has assumed an international character, reliable yield
statistics are of great importance. In India such statistics are of
special help in estimating the food shortage in the country. The
following two methods are in use for estimating yield
from crops :

(a) Traditional Method ;

(b) Method of Crop-cutting experiments.

(a) Traditional Method :—According to this method, yield
is estimated by the following formula :—

Yield = Area × Normal yield × Condition factor (or seasonal factor).

Normal yield is generally taken as the average yield on average
soil in a year of average character. There is a great deal of
confusion as regards the conception of normal yield. The normal
yield is fixed for a quinquennium. It is subject to revision every
five years on the basis of crop-cutting experiments.


The condition factor varies from year to year depending upon
rainfall etc. The condition factor is determined by the Patwari in
terms of annas. That is why it is called the annawari system. The
normal yield is taken as equivalent to 16 annas, and on this basis
the year's crop is guessed as so many annas. The traditional
method is clearly subjective and is open to the personal bias of
the observer.
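
The arithmetic of the traditional method can be illustrated with a short
Python sketch. It is only an illustration: the function name and the
figures (1,000 acres, a normal yield of 10 maunds per acre, a 12-anna
crop) are hypothetical and are not taken from any official return.

```python
def traditional_yield(area_acres, normal_yield_per_acre, annas, normal_annas=16):
    """Total outturn = area x normal yield x condition factor.

    The condition factor is the Patwari's anna rating divided by the
    number of annas taken to represent a normal crop (16 by default).
    """
    if normal_annas <= 0:
        raise ValueError("normal_annas must be positive")
    condition_factor = annas / normal_annas
    return area_acres * normal_yield_per_acre * condition_factor


# Hypothetical example: 1,000 acres, normal yield 10 maunds per acre,
# crop rated at 12 annas, so the condition factor is 12/16 = 0.75.
print(traditional_yield(1000, 10, 12))  # 7500.0 maunds
```

Since the anna rating is a guess by the observer, any error in it passes
in direct proportion into the estimated outturn.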

(b) Method of crop-cutting experiments :—The method of
crop-cutting experiments for the estimation of yield statistics is
more scientific and is an improvement over the traditional method.
This method is based on Dr. P. V. Sukhatme's plan. It
is an application of the Random Sampling Technique, and was
adopted in the National Sample Survey. This method is now
adopted throughout India with the help of statistical experts of
the Indian Council of Agricultural Research and of the Indian
Statistical Institute. According to this method, a certain number
of villages are selected in each tehsil and within each village 2
or 3 fields are chosen by the random sampling method. In the selected
plots the crop is cut, threshed, winnowed and weighed immediately,
and due allowance is made for moisture on the basis of special
experiments conducted for the purpose. This method is far
superior to the traditional method. The average yield is directly
obtained. The method is objective and the extent of error is
ascertainable.
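
The idea that the error of a crop-cutting estimate is ascertainable can be
sketched in Python as follows. This is a simplified illustration, not the
official procedure: it treats the sampled plots as one simple random
sample, ignores the two stages of selection (villages, then fields) and
the moisture allowance, and uses hypothetical field numbers and plot
yields.

```python
import random
import statistics


def sample_plots(field_ids, k, seed=None):
    """Choose k fields at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(field_ids, k)


def estimate_mean_yield(plot_yields):
    """Return the sample mean yield and its standard error."""
    n = len(plot_yields)
    if n < 2:
        raise ValueError("at least two plots are needed to estimate the error")
    mean = statistics.mean(plot_yields)
    std_error = statistics.stdev(plot_yields) / n ** 0.5
    return mean, std_error


# Hypothetical village with fields numbered 1 to 40; three are selected.
chosen = sample_plots(list(range(1, 41)), k=3, seed=1)

# Hypothetical yields (maunds per acre) from five harvested plots.
mean, std_error = estimate_mean_yield([9.0, 11.0, 10.0, 12.0, 8.0])
print(chosen, mean, round(std_error, 3))
```

The standard error is what distinguishes this method from the annawari
guess: it gives a measurable margin of uncertainty around the average
yield.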


Statistics of Land Utilisation 


Statistics of land utilisation are available in India since 1884, 
though their coverage has been gradually expanding. These 
statistics are published in Volume I for the States and in 
Volume II for the districts of the ‘Indian Agricultural Statistics’ 
issued by the Directorate of Economics and Statistics, Ministry of 
Food and Agriculture, Government of India. Land utilisation 
statistics are available for 87% of total geographical area of India. 
Land utilisation statistics are published in nine sub-classes according 
to actual use of the land as follows :— 

1—Forests—This includes both state owned and privately
owned forest area.

2—Land put to non-agricultural uses—It includes all land
occupied by buildings, roads and railways or under water,
that is rivers and canals, and other lands put to uses other than
agriculture.

3—Barren and uncultivable land—It includes all barren and
uncultivable land like mountains, deserts etc.

4—Permanent pastures and other grazing lands—It includes
all grazing land whether constituting permanent pastures
and meadows or not.

5—Miscellaneous trees and groves not included in the net
area sown—It includes all cultivable land not included
under net area sown, such as land under grass, bamboos, bushes etc.

6—Cultivable waste—It includes land available for cultivation
but not cultivated, whether fallow or jungle not put to
any use.

7—Current fallows—It includes cultivable land where no crop
is sown in the current year, i.e. fields left unused to
recoup lost fertility.

8—Other fallow lands—This includes land which was
previously used for cultivation but has been left unused for the last
five or more years. This may be due to poverty of the
cultivator, inadequate supply of water, malarial climate,
silting of land, soil erosion, unremunerative farming etc.

9—Net area sown—It includes area sown with crops and
orchards. Areas sown more than once are counted only
once.


Crop Estimates 


The crop estimates, originally known as crop forecasts, are
issued 70 times for 30 crops divided into six major groups as
shown below :—

(a) Cereals—Rice, jowar, bajra, maize, ragi, small millets,
wheat and barley.

(b) Pulses—Gram, tur, other kharif pulses and
other rabi pulses.

(c) Oilseeds—Groundnut, sesamum, rape and mustard,
linseed and castor seed.

(d) Fibres—Cotton, jute, sann-hemp and mesta.

(e) Plantations—Tea, coffee, rubber and coconuts.

(f) Miscellaneous—Sugarcane, potatoes, tobacco, pepper,
ginger and chillies.

The crop estimates are published with the object of making
available an early indication of the size of the crop before it is
actually harvested. The usefulness of such statistics depends upon
their publication well in advance of the time when the crops
are harvested. With this end in view, several estimates are made
for each crop at regular intervals. Generally three estimates are
published, but for certain crops like wheat and cotton more than
three estimates are published. The object of the first estimate,
which is issued one month after the crop is sown, is to give an
indication of the size or area of the crop and of weather
conditions at the time of sowing. This enables the Government
and the merchants to have an idea of the probable character of
the crop. The second estimate, which is published two months
after the first, gives detailed information regarding the area sown,
the probable character of the harvest and, in the case of some
crops, the expected yield. The final estimate gives details of the
total area sown and a quantitative estimate of the outturn harvested or
expected to be harvested. Remarks on prices, exports etc. are
also often included in the final estimate. These estimates are
regularly published in the ‘Estimates of Area and Production of
Principal Crops in India’ issued by the Directorate of Economics
and Statistics.


Food Statistics 


Food statistics are published in the annual Bulletin on Food
Statistics issued by the Directorate of Economics and Statistics
of the Ministry of Food and Agriculture. Information regarding
surplus or deficit of food-grains, import of food-grains etc. is
given in that publication.


Livestock Statistics 


India is predominantly an agricultural country; 80 per
cent of its population lives in villages and is connected with the rural
economy. Under such circumstances cattle wealth occupies an
important place. The facts concerning cattle wealth are of great
use in the country. In our country the statistics of livestock were
first collected at the instance of the Secretary of State for India,
and it was in the year 1883 that the Statistical Conference
prescribed a form on which details of the cattle census were to be
filled. Since then figures of livestock began to be published
quinquennially in ‘Agricultural Statistics of India’. It was in the
year 1916 that the Government of India decided to improve the
situation and to have a cattle census for the whole of the country.
A cattle census was held in the year 1920 and since then it has been
conducted every fifth year. The data collected through the Livestock
Census are published in the ‘Indian Livestock Census’. The
publication also contains figures about production of milk, butter,
ghee, meat, eggs, hides and skins, foreign trade in livestock and
utilisation of livestock products etc. It also contains data about
agricultural implements and machinery of different types, including
tractors, in various states.


The livestock statistics are not considered very satisfactory.
But there has been gradual improvement as regards both coverage
and quality of data. In 1956 certain improvements were
made and for the first time the census was conducted on a household
basis on a uniform proforma. A sample verification was also
done by the NSS in June-July 1956. The Indian Council of
Agricultural Research has also done sample verification.


Forest Statistics 


About 22% of the land area of the country is covered by
forests. Forest statistics are published in the ‘Indian Forest
Statistics’, which is issued by the Directorate of Economics and
Statistics of the Ministry of Food and Agriculture. This bulletin
contains information regarding :—


1—Area covered under forests, classified according to
ownership, legal status and composition of timber. In
respect of ownership, forest areas are divided into those
owned by the State Governments, Civil Authorities, corporate
bodies and private individuals. According to legal
status, forest areas are classified into reserved forests,
protected forests and unclaimed forests under each type of
ownership. According to timber composition, forest areas
are classified into (i) conifer forests and (ii) broad leaved
species, further classified into sal, teak and other varieties.

2—Volume of standing timber, firewood and their increment
in exploitable forests. ‘Indian Forest Statistics’ provides
separate figures for different types of available timber
along with gross annual increment, natural loss, annual
felling and net increment etc.

3—Out-turn of timber and other minor produce.

4—Number of persons employed.

5—Revenue and expenditure statistics of Forest Departments.

6—Foreign trade in forest produce.


Defects of Indian Agricultural Statistics


Indian agricultural statistics suffer from a number of defects
and shortcomings. Due to these defects the reliability of such data
is questionable. The Foodgrains Enquiry Committee, 1957,
observed, “that the available statistics regarding foodgrains were
very unreliable, we took special care to go into the question.” The
Committee further stated, “that statistics of food production by
their very nature cannot be completely accurate and they can
never be as firm as statistics of, say, industrial production or bank
credit or money supply. But the quality and coverage of food
production statistics in India are gradually improving and they
are now much better than what they were ten years ago. The
improvements in the collection themselves have introduced,
however, a certain element of incomparability between production
figures of recent years and those of earlier years.” Prof. D. N.
Elhance observes, “Our yield statistics are yet very defective and
we do not know still how much do we produce, whether it is
enough for our own requirements, whether we can export if there
is a surplus and how much to import if there is a deficit. This
is a very unsatisfactory state of affairs and at a time when we
are having a planned economic development this lacuna is really
one which deserves all attention.” While addressing the annual
meeting of the ‘Indian Society of Agricultural Statistics’ in New Delhi,
Dr. P. S. Loknathan very rightly observed, “the defects in agricultural
statistics are many. Complete absence of data on some
important topics is a matter of serious consequence. Coverage is
insufficient and data are deficient; because of lack of uniformity
in definition and classification they also suffer from non-comparability
over space and time. The divergence in figures
supplied by various agencies leads to defects in planning and
co-ordination. The method of presentation of data is also at times
defective. In addition there are defects of tabulating and
processing. These shortcomings together with the delay in
collection and publication of the data diminish their utility to
both the Government and public.” The defects found in the
agricultural statistics of the country, though they are in the process
of being remedied gradually, are :—


1—Gaps in the coverage 


Though the position has considerably improved now,
there is still much to be done in this respect. There are gaps in the
coverage of agricultural statistics. Such gaps are of three types :
(i) Gaps in the coverage of area—Available agricultural statistics
relate only to 87% of the total area of the country. For the
rest, approximate and conventional estimates are given. These
areas are in the states of Jammu and Kashmir, Rajasthan, Assam
and Gujarat. These areas are backward and in certain cases
inaccessible also. Certain parts of the reporting areas and
the greater part of the non-reporting areas have not been surveyed, and
this constitutes the main difficulty in the way of establishing a proper
statistical organisation in these areas. (ii) Gaps in the coverage
of crops—No information is available on the production of certain
crops like fruits and vegetables, minor cereals, fodder etc. Gaps
in the coverage of crops limit the scope of agricultural statistics.
(iii) Gaps in the coverage of items—Information on certain items
pertaining to agriculture and having a direct bearing on it is
inadequately available. Cultivators' holdings, livestock and
indebtedness are a few examples.


2—Lack of uniformity in definition, classification and 
technique 


Definitions of various statistical units are not uniform
throughout the country. Procedures and methods of computation
also differ, for example :—

(i) Area statistics are collected differently in temporarily
and permanently settled areas, and field to field inspections
are not undertaken in permanently settled areas (except in
Bihar and Orissa) because of the absence of a local agency.

(ii) The classification of area in the agricultural statistics of
India was not uniform because definitions differed.
Uniform definitions have now been adopted.

(iii) Methods of finding average yield also differ from
state to state. Generally it is Normal Yield × Condition
Factor, but in Punjab it is directly estimated in terms
of maunds per acre.

(iv) The anna equivalent of the normal crop is not the same
in all the states. Different states follow different
rules. In some states 16 annas represent the normal crop
while in others 13 or even 12 annas serve the purpose.
Sometimes the number of annas representing the normal
crop differs from place to place, even in the same state.

(v) The definition of harvest price is not uniform, because in
some states it means the average wholesale price at
important markets, while in others it means retail
prices there.

(vi) Methods of recording areas under mixed crops and
under bunds and uncultivated patches are also different.
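
The effect of the differing anna equivalents noted in item (iv) can be
seen in a small Python sketch. The function name and the 12-anna rating
are hypothetical; the bases of 16, 13 and 12 annas are those mentioned
above.

```python
def condition_factor(annas, normal_annas):
    """Fraction of a normal crop implied by an anna rating."""
    if normal_annas <= 0:
        raise ValueError("normal_annas must be positive")
    return annas / normal_annas


# The same reported rating of 12 annas under three different conventions.
for base in (16, 13, 12):
    print(base, round(condition_factor(12, base), 3))
```

A crop reported as 12 annas is three-quarters of normal where 16 annas
is the standard, but a full normal crop where 12 annas is the standard,
so figures from different states cannot be compared until the base is
known.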


3—Defects of Tabulation and Processing 


The classification and tabulation of agricultural statistics are
not done satisfactorily even at present. This leads to a criminal
waste of human energy, time and public funds. Most of the
information collected by the primary reporting agency at the
village level goes to waste for want of proper statistical treatment.
The classification, too, is often unsatisfactory. For
example, the new classification of land utilisation statistics does
not give any idea about the suitability of land for various purposes.
This classification does not take into account economic or other
considerations, without which any classification becomes less useful.
There is much to be done in the field of proper classification and
proper processing of the agricultural statistics.


4—Defects of primary agency


There are certain defects in the primary reporting agencies.
Though agricultural statistics in the temporarily settled areas are
more reliable than those of permanently settled areas, there
remains enough scope for improvement there too. Most of the
defects in the primary reporting agency are due to (i) the heavy
burden of work on Lekhpals and their participation in work outside
their routine, (ii) the low remuneration paid to them, (iii) lack of
adequate training for handling their job and (iv) neglect and
bias on the part of the primary reporting agency.


5—Defective supervision and inspection


The work of the Lekhpal is checked by the Kanungo, Naib-
tehsildar and tehsildar. These officers are burdened with heavy
administrative duties; hence they do not devote due attention to
the work connected with collection of statistics. In fact the
entries of the Lekhpal should be checked on the basis of random
sampling. It is gratifying to note that some State Governments
are now insisting on better supervision and checking of the work
of the primary reporting agencies and are also gradually following
the system of random sample checks on the work done by the Lekhpal.
Schemes of central supervision and checking have also been
introduced by the Government of India recently. However, the
supervision work done by these agencies has not been found to be
of a high quality.

6—Defective Co-ordination 

Another defect in the present system of agricultural statistics
is the lack of proper co-ordination and planning. In various
States, the Departments of Agriculture, Food and Civil Supplies etc.
collect agricultural statistics independently of each other and these
data are not co-ordinated. Various Departments also conduct
ad hoc surveys regarding the same problem independently of each
other, resulting in duplication of work and wastage of time and
energy. Attempts are now being made
by the State Governments to co-ordinate the statistical work of
the different departments. The CSO also co-ordinates the
statistical work of the various statistical organisations of the states.

7—Delay in publication

Undue delay in the publication of agricultural statistics
defeats the very aim of collecting statistics. Various committees
and commissions appointed from time to time have repeatedly
pointed out this defect and have recommended publication of
statistics in time. India is a big country and the time of sowing
of various crops differs from state to state. The process of
collection is also very long. Figures are first collected at the village
level by the Lekhpal, then at the tehsil level, then
at the district level and then at the state level, and are finally
co-ordinated by the central authority, where they are processed and
published. In this long chain, if figures are delayed at any stage,
the delay multiplies. In order to minimise the delay, red-tapism in
statistical work should be given up. The district statistical
offices should have direct contact with the central co-ordinating
agency.

8—Non-availability of statistics relating to certain items 


Even at present there are a large number of items about
which no data are available. India being an agricultural country,
it is necessary that statistics should be collected about various
characteristics of our rural economy. Techno-economic surveys of
rural areas should also be conducted on a large scale.


Steps to remove defects of Agricultural Statistics 


The national Government has from its very beginning paid
due attention to improving the statistical material of the country
both qualitatively and quantitatively. In 1949 an Agricultural
Statistics Co-ordination Committee was appointed under the
chairmanship of Shri W. R. Natu. The Committee examined
the defects of the Indian agricultural statistics and
made a number of valuable recommendations. On 25th September,
1953, in the Conference of States Agricultural Ministers, which
was held under the chairmanship of Dr. P. S. Deshmukh, the
then Agricultural Minister of the Government of India, a number
of suggestions were made in the direction of improving the
agricultural statistics. The National State Agricultural Intelligence
Board in its meeting held on 29th May, 1962, recommended
to the states that steps should be taken to improve the primary
reporting agency and that surveying of land should also be done. The
State Governments were also assured of financial help from the
Centre. As a result of these recommendations by various
Committees and Conferences, there has been some improvement
in the quality and quantity of agricultural statistics. There has
been improvement particularly in the following directions :—
1—Extension in the Coverage :—About 87% of the land area
has been covered and steps are being taken to cover the remaining
areas also. The number of commodities for which Crop Estimates
are made has also increased. The statistics of land utilisation
and their classification have also improved. The statistical
organisation has also been strengthened by appointing a District
Statistical Officer in each district. There has been much
improvement in the primary reporting agency. Statistics on
various new topics are also collected. But in spite of all these
improvements much is still left to be done. It is necessary that
statistics should be collected on problems like : (i) production
cost and levels of agricultural income, (ii) agricultural
unemployment in different states, (iii) rural indebtedness and the effect
of rising prices on the rural economy, (iv) resources and necessities
of different groups of agriculturists, (v) extent of savings and
investments of the agriculturists, (vi) agricultural wages, (vii)
incidence of taxation in the rural areas, (viii) extent of leased
agriculture. Dr. S. R. Sen has suggested that statistics should be
collected regarding :—
(i) Fisheries and livestock products.
(ii) Number and distribution of holdings according to size,
either according to ownership or to operational unit.
(iii) Holdings to be related to livestock and farm population.
(iv) Statistics of area under improved agricultural practices.
(v) Extension of random sampling method of crop-cutting
to commercial crops.
(vi) Land classification according to land-use, capabilities
and potentialities.
(vii) Data on costs and returns to be built up.


CHAPTER 4


NATIONAL INCOME ESTIMATES 


“We have been invited to consider the materials available
for estimating the national income and wealth of India. These
materials are very defective. To put it briefly, the statistics of
crop production leave much to be desired, while statistical
information about other important parts of agricultural income
such as the output of animal husbandry are almost completely
lacking and statistics of industrial production are patchy in the
extreme.” (Bowley-Robertson Committee Report). These remarks
were made in the year 1934 by the two well-known
economists of England, but no scientific measures were adopted
by the Government of India, till it was very late, to estimate the
national income of the country.


Early attempts at the Measurement of National Income in
India—The calculation of per capita national income is not an
end in itself, but simply a means to an end. Its importance in
planning future economic policies cannot be overemphasized.
Every sort of economic planning must necessarily aim at increasing
the per capita national income. It also serves as a measuring
rod for comparing the general economic conditions of the people
in different countries or in the same country at different times,
and changes in the various components of national income reflect
the changing pattern of the economic set-up. But despite
its playing such a vital role in the economic life of a nation, no
systematic and scientific attempt was made up to 1931-32. No
doubt there are not less than eleven estimates. It is said that
‘no statistics are better than bad statistics’, but it may be preferable
to say that ‘some statistics are better than no statistics’.


Starting from the pioneering work of Dadabhai Naoroji, the 
grand old man of India, for the year 1867-68, three attempts at 
estimating the national income of India have been made during 
the last three quarters of a century. Of these attempts, some 
are quite comprehensive, whereas others are mostly in the form 
of some general notion on per capita income. The names of the 
various estimators suggest that the foremost of India's intellectuals 
and administrators were drawn towards the problem of measuring 
the sum total of economic activities in the country. This was not 
just the result of the pull exercised by intellectual curiosity alone. 
The results produced by them served as valuable ammunition in 
the developing battle between nationalist ideas and foreign rule. 
The figure of per capita income, this handy expression of a 
country's life in easily remembered numerals, was used by people 
on both sides of the battle, either to prove the disastrous 
consequences of British rule over India or to impress upon the 
unbelievers the supreme benefits flowing from it. 

The results of past estimates of National Income of India are 
summarized below :— 


No.  Estimated by            Year for which   Agricultural income   Area covered
                             estimated        as % of the total     by the attempt

 1.  Dadabhai Naoroji*       1867-68          77                    Br. India
 2.  Baring and Barbour†     1882             67                    Br. India
 3.  Lord Curzon             1897-98          67                    Br. India
 4.  W. Digby‡               1899             64                    Br. India
 5.  E. C. Atkinson          1875             55½                   Br. India
 6.  E. C. Atkinson          1895             57                    Br. India
 7.  Sir B. N. Sharma        1911             —                     Br. India
 8.  Wadia & Joshi           1913-14          —                     Whole of India
 9.  Vakil and Muranjan      1910-14          —                     Br. India
10.  F. Shirras              1921             —
11.  Shah and Khambhata§     1921             89                    Whole of India


The first attempt to estimate the national income of India 
was made by Dadabhai Naoroji in his well-known book “Poverty 
and Un-British Rule in India”. His estimates were based on the 


* Includes material production and асоба most of services. 
† Non-agricultural income assumed to be ½ of the agricultural income. 
‡ Income assumed to bear a fixed relationship to land revenue. 
§ Production excluding services. 
Production excluding services. 

official statistics relating to the years 1867-70. The net result 
which one can derive by looking at the summary of the major 
estimates is that even when the separate attempts related to 
the same year or to not too distant periods, the differences between 
the estimates of per capita income were very wide indeed. For 
instance, Lord Curzon's estimate of per capita income of British 
India for 1897-98 was two-thirds higher than William Digby's 
estimate for the following year. Apart from the personal 
predilections of the various estimators, there were also important 
conceptual differences among them which made their results very 
divergent. Similarly the estimate of per capita income made by 
Baring and Barbour was a third higher than that of Dadabhai 
Naoroji, mainly because Baring and Barbour attempted to cover 
services whereas Naoroji left them out on principle. 

While making comparisons of these estimates, one has to keep 
certain precautions in mind. These estimates were in current 
prices and relate to different dates. Therefore they are not 
comparable without adjustment for price changes. Another fact to 
be remembered is that the area covered by the estimators is not 
the same in every case. We must further allow for the differences 
in the methods adopted in the enquiries. Again there is a difference 
of treatment arising from divergent views as to the constituent 
elements of the national income. For example, Shirras included 
the income of the professional classes in the total, while it is 
deliberately excluded in some of the estimates. In order to make 
comparisons between the results of enquiries relating to two 
different periods, we must take the figures not as they are 
given but as they would have been if the method adopted had 
been identical. There is one more point to be noticed: the 
later valuations are more accurate than the earlier ones, for the 
simple reason of the gradual development of statistical methods 
in India. In considering the results of any particular enquiry we 
must take into account the spirit and purpose underlying it. 
Some of the investigations have been inspired by a spirit of 
political controversy. 

Although the various estimators used divergent approaches 
for estimating non-agricultural income, they all had one thing 
in common. All of them, with the notable exception of W. Digby, 
used roughly the same procedure to estimate income originating 
in agriculture. The estimates of agricultural output were based 
on the data furnished by the annual series “Estimates of Area 
and Production of the Principal Crops,” which began to be issued 
in the closing decades of the last century. These estimators 
relied heavily on official estimates of crop forecasts, area and 
yield for deriving agricultural incomes. Considering the nature 
of the data handled and the methods that had to be devised, it 
is quite probable that the estimates of crops suffer from various 
weaknesses and may understate or overstate the actual output 
in many instances. 


The problems involved in estimating non-agricultural income 
were complex indeed. Some of the estimators (Baring and Barbour, 
and Curzon) had a simple but not very convincing solution to 
this. Since the population engaged in all non-agricultural 
occupations was half of that in agriculture, they simply took all 
non-agricultural incomes also to be one-half of their estimates 
of the value of agricultural output. Others (D. Naoroji, Digby, 
and Shah and Khambhata) attempted to estimate the value of 
the output of large-scale factory industries and of small-scale 
cottage industries, but excluded all the services. Professor K. T. 
Shah and Khambhata set out at length their reasoning for the 
exclusion of services. They stated, “All services have to be and 
are rewarded ultimately from the same dividend (yearly total) 
of material commodities produced in a nation. When we have 
measured the material commodities, we must necessarily be taken 
to have included also the services—not only those which are 
actually, obviously, directly involved in the production of those 
commodities, but also those which are ancillary or incidental to 
that production (such as the government official or the soldier).” 
“There is not a shred of excuse to speak of the non-industrial 
services (i.e. services which do not result in material production) 
as being measurable in money. They result only in such utilities 
as advice, knowledge, guidance, pleasure, comfort, relief from pain 
etc., which being psychic are non-measurable. These, therefore, 
though a species of income in the broad sense, cannot enter into 
a computation of the national dividend.” Their estimates, 
formulated on the basis of the physiocratic concept of material 
output, find a close conceptual parallel in the estimates prepared 
in the U.S.S.R., Eastern European countries and the Republic of China 
of what is called the net material product. Those who attempted 
to estimate the income from small enterprises, commerce, transport 
and other services faced a difficult task. In this quest for 
comprehensiveness each of the estimators devised his own 
ingenious methods to attain perfection; the results of this highly 
individualistic guess-work differed widely, making it impossible to 
piece all their efforts together to provide some over-all long-term 
view. 
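
The half-of-agriculture shortcut attributed above to Baring and Barbour and to Curzon can be illustrated with a minimal sketch. The agricultural figure is a hypothetical round number, not one taken from any of the estimates, and the function name is only illustrative.

```python
def total_income_half_rule(agricultural_output):
    """Apply the shortcut: non-agricultural income = one-half of agricultural output."""
    non_agricultural = agricultural_output / 2
    total = agricultural_output + non_agricultural
    return non_agricultural, total

# Hypothetical agricultural output, in any money unit.
agri = 300.0
non_agri, total = total_income_half_rule(agri)
agri_share = agri / total  # two-thirds under this rule, whatever the input
print(non_agri, total, round(100 * agri_share, 1))
```

Under this rule agriculture always accounts for two-thirds of the total, so the non-agricultural side of such an estimate carries no independent information.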


However convincing the results of one or the other estimator 
might have been to the people of that age, they remain singularly 
uninformative for understanding the long-term trend of economic 
growth in the country. None of the estimates is in a form that 
can provide even an approximate answer to a simple question 
such as: Has the real per capita income increased or not over 
the last 75 years or so? If even a general answer to this question 
is not possible, it is in vain to ask for more specific replies, such as: 
at what rate has it changed over these decades? 


This is in no way to detract from the merits of these studies, 
for each of them no doubt contributed a great deal in building 
up a body of knowledge about various branches of economic 
activity in the country. In spite of discrepancies in the results obtained 
by the different investigators, one fact which clearly stands out 
from all of them is that the inhabitants of this country are poor 
and the national income is very unevenly distributed. Though 
the picture presented by them was a static one, our knowledge 
of the anatomy of the country’s economy would have been 
considerably more imperfect without them. 


The First Scientific attempt by Dr. V. K. R. V. Rao :— 
Dr. (Mrs.) Vera Anstey remarked, “Mr. V. K. R. V. Rao has 
made the most satisfactory assessment so far available of the 
national income of British India. But this estimate despite the 
ingenuity of Mr. Rao still partakes of the nature of a good guess 
rather than record of facts.” The work of Dr. Rao was a most 
timely one, as in war time the national income estimates of 
a country have great importance. Before proceeding to the actual 
examination of his estimate, it is essential to see how far his 
definition of national income coincides with the net national 
income conception. Now national income means the value of the 
output of goods and services produced in twelve months excluding 
what is required to maintain and replace national capital. 
Dr. V. K. R. V. Rao defines the national income as follows : 


“The national income of a country is the money value of 
the flow of commodities and services excluding imports becoming 
available for sale (or capable of being sold) within the period, 
the value being reckoned at current prices minus the sum of the 
following items :— 


(a) The money value of the flow of goods and services 
used up in the course of production. 


(b) The money value of any diminution in stocks that 
may have taken place during the period. 


(c) The money value of the flow of goods and services 
used to maintain intact existing capital equipment 
(value being reckoned at current prices in all these 
cases). 


(d) Receipt of the State from direct taxation. 


(e) Favourable balance of trade including transaction in 
treasure. 


(f) Net increase in the country's foreign indebtedness or 
the net decrease in the holdings of balances and 
securities abroad whether by individuals or the 
Government of the country. 


The first part of the above definition corresponds to gross 
income and the second clause enumerates the deductions necessary 
to reduce it to net income." 
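
The arithmetic of the definition quoted above, a gross flow less the sum of items (a) to (f), can be set out in a short sketch. Every figure below is hypothetical and is chosen only to show the subtraction; the dictionary keys are illustrative labels, not Dr. Rao's own terms.

```python
def rao_net_income(gross_flow, deductions):
    """Gross money value of goods and services less the sum of items (a) to (f)."""
    return gross_flow - sum(deductions.values())

# All figures are hypothetical and serve only to show the arithmetic.
deductions = {
    "a_used_up_in_production": 120.0,
    "b_diminution_in_stocks": 5.0,
    "c_maintenance_of_capital": 30.0,
    "d_direct_taxation": 10.0,
    "e_favourable_trade_balance": 8.0,
    "f_net_rise_in_foreign_debt": 2.0,
}
gross_flow = 500.0
net_income = rao_net_income(gross_flow, deductions)
print(net_income)
```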


The main merit of Dr. Rao's definition is that he rightly 
includes goods and services and emphasises the importance of the 
difference between net and gross national income. He has profited 
greatly from a careful study of the works of his predecessors, 
which have shown him numerous pitfalls that await the 
investigator in a country in which statistical data leave so much to be 
desired as they do in India. No census of industrial production 
had been taken up to that time, and there was uncertainty in the data 
regarding agricultural activities of the country. But still Dr. Rao 
claimed greater accuracy as compared to previous estimates on 
the ground that he had supplemented the available statistical 
material by a number of ad hoc enquiries in respect of certain 
items of the total of the national income. He made an estimate 
for the year 1931-32, a census year. According to his calculations 
the per capita national income was Rs. 65/- with a margin of 
error of ± 6%. The details are as follows :— 

Description                                  Value in           Margin of
                                             millions of Rs.    error (%)

Value of Agricultural output                      5927              —
Value of Livestock Products                       2683             ± 10
Value of Fishing and Hunting                       120             ± 20
Value of Forest Products                            92              —
Value of Mineral Products                          180              —
Income assessed to Income Tax                     2161              —

Income not assessed to Income Tax :—
  (i)   Workers engaged in industry               2100             ± 17
  (ii)  Services of Railway and P. & T.            590              —
  (iii) Workers engaged in trade                  1233             ± 15
  (iv)  Professions and liberal arts               416             ± 15
  (v)   Transport other than railways              283             ± 20
  (vi)  Domestic services                          325             ± 20
Miscellaneous                                      780             ± 10
Total                                           16,890             ± 6
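
The figures of the table can be checked by simple addition. The sketch below sums the thirteen components and derives two quantities that are not stated in the text and should be read as illustrative arithmetic only: the population implied by dividing the total by Rs. 65 per head, and the range given by the 6 per cent margin.

```python
# Values in millions of Rs., as given in Dr. Rao's table for 1931-32.
components = {
    "agricultural output": 5927,
    "livestock products": 2683,
    "fishing and hunting": 120,
    "forest products": 92,
    "mineral products": 180,
    "income assessed to income tax": 2161,
    "workers engaged in industry": 2100,
    "railway and P. & T. services": 590,
    "workers engaged in trade": 1233,
    "professions and liberal arts": 416,
    "transport other than railways": 283,
    "domestic services": 325,
    "miscellaneous": 780,
}
total = sum(components.values())          # millions of Rs.
per_capita = 65                           # Rs., Dr. Rao's figure
implied_population = total / per_capita   # millions of persons (derived, not from the text)
low, high = total * 0.94, total * 1.06    # range implied by the 6 per cent margin
farm_share = (components["agricultural output"]
              + components["livestock products"]) / total
print(total, round(implied_population, 1), round(low), round(high),
      round(100 * farm_share, 1))
```

The components add up to the stated total of 16,890, and crops and livestock together come to a little over half of it.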


The sources from which Dr. Rao gathered statistical 
information may be divided into three. First, he had recourse to all 
published official and semi-official literature that threw the least 
light on the economic conditions of any sector of our population. 
He examined village studies, working class budgets, the reports and 
evidence of the Royal Commissions on Agriculture and Labour 
and the Banking Committees, and a number of agricultural and 
industrial bulletins and monographs published by different 
provincial governments. By this means Dr. Rao was able to 
gather some quantitative data on such subjects as milk yields, 
consumption of grain by cattle, value of agricultural implements, 
distribution of holdings and earnings of urban and rural artisans. 
The second source was ad hoc enquiries of his own into the quality 
and the value of the output of livestock products and the earnings 
of some important classes of non-agricultural workers. The third 
source was the replies received to letters addressed on subjects 
such as wages, milk yield, cattle feed rates, yield of special crops 
etc. to the chief officers of different governments in charge of 
departments touching his enquiry. 

Now let us examine his calculations in a little detail. He 
calculated the yield of agriculture on the basis of the following 
formula : 

(Standard yield × state of the crop for the year expressed 
as a percentage of the standard yield × cultivated area 
under the particular crop). 

But there were elements of varying degrees of uncertainty in 
all the three factors that enter into the estimate of agricultural 
production, which constitutes over half the total income. The 
statistics of the area sown with the various crops had reached a high 
degree of accuracy except for the permanently settled areas, which 
had then no suitable reporting agency. The standard yield was 
based on the figures compiled for the Famine Commission, brought 
up to date by crop-cutting experiments. The weakest link in the 
formula was the percentage of the standard yield. This was worked 
out from the reports of village officers who, for reasons deeply 
embedded in India’s past, take a notoriously pessimistic view. 
Dr. Rao thought that individual errors cancelled themselves out. 
He supplied the necessary corrections for cotton and jute 
crops, but was not able to do so for other crops for which it was 
equally necessary. His estimates of agricultural production must 
therefore be regarded as an under-estimate. 
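
The three-factor formula, and the way a pessimistic report of the state of the crop pulls the estimate down, can be sketched as follows. The crop, its standard yield, the area and both condition percentages are hypothetical; only the form of the formula comes from the text.

```python
def crop_output(standard_yield, condition_pct, area):
    """Standard yield x (condition as % of standard yield) x area under the crop."""
    return standard_yield * (condition_pct / 100.0) * area

# Hypothetical crop: standard yield of 800 lb per acre, 1,000 acres sown.
reported = crop_output(800, 75, 1000)   # village officer reports 75 per cent
actual = crop_output(800, 90, 1000)     # assumed true condition of 90 per cent
shortfall = 1 - reported / actual       # fraction by which output is understated
print(reported, actual, round(100 * shortfall, 1))
```

Because the three factors are multiplied, a bias in the condition percentage passes straight into the output figure: here a report of 75 against an assumed true 90 understates output by one-sixth.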

The next problem is that of the valuation of such produce. 
Dr. Rao valued it at the average harvest price ruling in the market. 
But the ideal valuation, in a country like India where 
agriculture is so important, is to take the harvest prices actually 
received by the farmers, because in India the farmer receives 
much less than what the consumers pay. Secondly, he 
excluded from his total the amount of produce which was bartered 
or was kept by the producers for their own consumption. This is the 
chief defect in his total. In the case of the incomes 
assessed to income tax, the incomes of state employees and 
the miscellaneous items, the values are the products of the estimated numbers 
of employed earners and their average earnings. In these cases 
the relative errors in the two factors of the product are approximately additive. 
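
The remark that errors in the two factors of a product are additive can be checked numerically. This is a minimal sketch with hypothetical error rates; it compares the exact relative error of a product with the simple sum of the two relative errors.

```python
def product_error(rel_err_a, rel_err_b):
    """Relative error of a product when each factor is overstated by the given fraction."""
    exact = (1 + rel_err_a) * (1 + rel_err_b) - 1
    additive = rel_err_a + rel_err_b
    return exact, additive

# Hypothetical case: number of earners 10 per cent out, average earnings 5 per cent out.
exact, additive = product_error(0.10, 0.05)
print(round(exact, 4), round(additive, 4))
```

The exact figure exceeds the sum only by the cross term (here 0.10 × 0.05), which is why the additive rule is a fair approximation when both errors are small.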

Dr. Rao arrived at a national income figure of Rs. 65/- per 
capita for the year 1931-32. Dr. R. C. Desai tackled the problem 
from the other side and computed a per capita expenditure of Rs. 82.5. 
Dr. Desai’s estimate of per capita consumer expenditure (this 
excludes net public and private capital formation and free public 
services to the ultimate consumer) for 1931-32 was thus about 27 per cent 
higher than Dr. Rao's estimate of per capita income for the same year. 
Consumer expenditure falls short of national income by (a) 
government expenditure less indirect taxes and (b) net capital 
formation. We cannot say that either of these was negative. 
The national income figure therefore should certainly have been higher 
than Rs. 82.5. Therefore Dr. Rao’s figure seems to be a 
considerable under-estimate. 
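
The comparison of the two figures rests on a simple identity, which may be sketched as below. The two per capita figures are those given in the text; the zero values passed for items (a) and (b) are an assumption made only to obtain a lower bound, and the function name is illustrative.

```python
rao_income = 65.0          # Rs. per capita national income, Dr. Rao, 1931-32
desai_expenditure = 82.5   # Rs. per capita consumer expenditure, Dr. Desai

excess = desai_expenditure / rao_income - 1  # expenditure over income, as a fraction

def national_income(consumer_expenditure, govt_less_indirect_taxes,
                    net_capital_formation):
    """National income = consumer expenditure + items (a) and (b) of the text."""
    return consumer_expenditure + govt_less_indirect_taxes + net_capital_formation

# If neither extra item is negative, expenditure alone is a lower bound on income.
lower_bound = national_income(desai_expenditure, 0.0, 0.0)
print(round(100 * excess, 1), lower_bound)
```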

In spite of all such criticism it is an admitted fact that this work 
dealt with an important problem of the national economy. Despite 
his difficulties Dr. Rao was able to break new ground. His estimates 
reflected the poor economic conditions of the masses. 

Official Estimates of National Income. Though the Ministry 
of Commerce of the Government of India used to estimate the 
national income of the country, chiefly for the purpose of supplying 
information to the U.N.O., serious steps in this direction 
were taken only when a permanent National Income Unit was 
established and the National Income Committee was appointed. 

The National Income Committee was appointed by the 
Government of India by resolution number 15 (33)—P/49 dated 
4th August 1949, which reads as follows :— 

“The Government of India have been giving consideration for 
some time to the inadequacy of the factual data available for the 
formulation of economic policies. One important gap is the absence 
of authoritative estimates of the national income and its various 
components. The Government of India have accordingly decided 
to set up a committee to advise how best this gap could be filled 
up. The terms of reference of the Committee are to prepare a 
report on the national income and related estimates, to suggest 
measures for improving the quality of the available data and for 
the collection of further essential statistics and to recommend 
ways and means of promoting research in the field of national 
income. The Committee will also guide the National Income 
Unit of the Government of India to compile authoritative 
estimates of the national income.” 

The Committee was constituted with Prof. P. C. Mahalanobis 
as its Chairman and Prof. D. R. Gadgil and Dr. V. K. R. V. Rao 
as its members. Dr. R. C. Desai acted as its Secretary, but he 
resigned from his post on 25th December 1949 and Mr. M. M. 
Mukherjee was appointed in his place. Arrangements were also 
made to secure the advisory help of Prof. Simon Kuznets of the 
University of Pennsylvania, Prof. J. R. N. Stone of the University of 
Cambridge and Dr. J. B. D. Derkson of the U.N. Statistical Office. 
The Committee submitted its First Report in April 1951 and its Final 
Report in February 1954. 

The method adopted by the Committee was the same as that 
adopted by Dr. V. K. R. V. Rao in his estimates. This Committee 
also combined the Inventory and Incomes methods for want of 
statistical data. The Inventory method was applied in estimating 
income from : 


(a) exploitation of animals and vegetation including animal 
husbandry and fishing, 

(b) exploitation of minerals, 

(c) industry. 

The Incomes Method was applied for estimating the 
contribution of Transport, Trade, Public Force, Public administra- 
tion, Professions, Liberal Arts and Domestic Services. 
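
The difference between the two methods can be shown in a minimal sketch. The inventory method is taken here in its simplest form, gross value of output less cost, and the incomes method as the number of earners multiplied by average earnings; all figures and both function names are hypothetical and are not drawn from the Committee's reports.

```python
def inventory_method(gross_value_of_output, cost_of_production):
    """Net output of a commodity-producing sector: gross value less cost."""
    return gross_value_of_output - cost_of_production

def income_method(number_of_earners, average_earnings):
    """Net output of a service sector: earners times average earnings."""
    return number_of_earners * average_earnings

# Hypothetical sectors; none of these figures comes from the Committee's reports.
mining = inventory_method(gross_value_of_output=90.0, cost_of_production=30.0)
domestic_service = income_method(number_of_earners=2000, average_earnings=0.25)
print(mining, domestic_service, mining + domestic_service)
```

The first approach needs output and cost statistics, the second needs a count of the working force and an estimate of earnings, which is why each was applied where the corresponding data were less weak.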

The estimates presented by the Committee were generally 
based on the regular official statistics, both Central and State. The 
Committee also made use of various unpublished material 
available in the Ministries and States. Thus the level of accuracy 
and comprehensiveness of the estimates was closely linked with the 
corresponding level of the official statistics. 


The Committee divided the sources of income under the 
following heads :— 
1—working force 
2—agriculture 
3—animal husbandry 
4—forestry 
5—fishery 
6—mining 
7—factory establishments 
8—small enterprises 
9—organised banking and insurance 
10—other commerce and transport 
11—professions and liberal arts and domestic service 
12—public authorities 
13—balance of payments and net income from abroad. 

Working force—The working force can be estimated by two 
methods. In the first method the figures of the population census 
and its occupational classification are used, and in the second 
statistics of employment in various industries and occupations are 
estimated on the basis of surveys or information flowing from 
various legislations. At the time of the final report, the 1951 
census figures were available to the Committee and many surveys 
had also been conducted to find out the employment figures in 
various occupations and industries, and all these statistics were 
utilised in the estimation of the working force in the final report. 
However, there were conceptual difficulties faced by the Committee. 


Income from agriculture—The inventory method was used 
in the estimation of income from agriculture. In the first report 
the Committee estimated the area of land under various crops 
and their yields and multiplied these figures by the available 
statistics of harvest prices to get the value of gross output. From 
the gross output a deduction of 21 per cent was made for the total 
cost of production to arrive at the figure of net output. In the 
final report the same technique was adopted with a few changes 
and adjustments which were made on the basis of new information 
then available. At the time of the 1948-49 estimates of national 
income, statistics of land utilisation were not available for the 
entire country. The Committee had to rely on some estimates 
of the Ministry of Agriculture and also on unpublished market reports 
prepared by the Agricultural Marketing Adviser to the 
Government of India. In 1954, in the final report, the Committee relied 
mostly on the estimates of the Ministry of Food and Agriculture and 
the use of figures based on market reports was abandoned. Some 
data were also obtained from the Indian Agricultural Research 
Institute and agricultural colleges. In the estimates of the final 
report, the agricultural produce was valued at average wholesale 
prices during the harvesting period. These prices were generally 
market prices of crops after cleaning, husking etc. The value of 
agricultural output was thus taken as including these ancillary 
services. This was not a correct procedure, but due to lack of 
proper statistics, there was no other way out. In 1948-49, the 
data regarding cost of agricultural production were very scanty. 
Therefore information regarding seeds, wastage, market charges, 
manures, repairs and depreciation of implements and feed of 
live-stock used on farms was obtained from the Ministry of Agriculture 
and from market reports. The reports, studies and enquiries 
prepared by the Gokhale Institute of Politics and Economics, Poona, 
and the Punjab Board of Economic Enquiry were also utilised for 
calculating the cost of repairs and depreciation of implements etc. 
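
The first-report procedure for agriculture, area times yield times harvest price less a flat 21 per cent for costs, can be sketched as follows. The two crops and all their figures are hypothetical; only the 21 per cent deduction comes from the text.

```python
COST_DEDUCTION = 0.21  # the Committee's 21 per cent allowance for cost of production

def net_agricultural_output(crops):
    """crops: list of (area, yield per unit area, harvest price) tuples."""
    gross = sum(area * yld * price for area, yld, price in crops)
    return gross, gross * (1 - COST_DEDUCTION)

# Hypothetical crops: (acres, maunds per acre, Rs. per maund).
crops = [(1000, 10, 12.0), (500, 8, 20.0)]
gross, net = net_agricultural_output(crops)
print(gross, round(net, 2))
```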


Income from Animal Husbandry, Forestry and Fisheries— 
It is very difficult to determine the income from live-stock products 
in the country. The National Income Committee mainly relied 
upon the Livestock Census of 1945 in both the first and the 
final estimates. In the final report, the Committee also utilised 
the information contained in the Poona Reports of the National 
Sample Survey and Professor Gadgil’s “Economic Effects of 
Irrigation”. However, the Committee stated that the data on 
livestock products were highly unsatisfactory and that their estimates 
were based on guesses and opinions. 


As regards estimates of income from forests, the main reliance 
was placed on "Indian Forests Statistics" 1949 published by the 
Ministry of Food and Agriculture. There were difficulties about 
estimation of the value of products for those areas for which 
statistics were not available and in such cases average values had 
to be used as there was no alternative. 


For estimating the income from fisheries, the statistical 
material available was not only meagre but also very unreliable; 
hence reliance had to be placed mostly on the Marketing Reports 
on Fish. 


Income from mining—The Committee did not find it difficult 
to estimate gross value of output in the mineral sector as ample 
statistics were available in the Report of Chief Inspector of Mines 
and surveys conducted by the Geological Survey of India. Figures 
were also collected by the Indian Bureau of Mines, the Coal 
Commissioner, the Salt Controller, the Petroleum Division and the 
Department of Commercial Intelligence and Statistics. The 
Committee mostly used the data collected by Geological Survey 
of India. 


Income from Factory establishments—There was a fundamental 
difference in the estimates of income of this sector in the 
first and final reports of the Committee. In the first report the 
industrial sector was classified in two broad categories, namely 
factory industries and small enterprises including handicrafts. 
For factory industries the figures of the number of persons 
employed were obtained on the basis of factory statistics with 
minor adjustment for lack of coverage. For small enterprises 
figures of employment were obtained by making deductions from 
the overall total of persons employed in industries. This was an 
unsatisfactory procedure. For the value of output, reliance was 
placed on the Census of Manufactures. The Census of Manufactures 
related to 29 industries only (out of 63); for the rest estimates 
were made. In 1954, when the final estimates were made by the 
Committee, various changes were made in the estimation of 
income. Output figures of the Sample Survey of Manufacturing 
Industries were relied upon instead of the Census of Manufactures. 
The average incomes in small enterprises were estimated on the 
basis of case studies. 


Income from Trade and Transport—In the first report the 
workers coming in these categories were divided for the purpose 
of income estimation into a number of groups like communication, 
railways, organised banking and insurance and other commerce 
and transport, the last group being the biggest one. For communication, 
the value of net output was derived from Government budgets 
and for railways from the statistics published by the Railway Board. 
For banks, statistics published by the Reserve Bank of India were 
used. In the case of insurance companies the data available 
in the Insurance Year Book were used. The Committee faced 
a lot of difficulties in the estimation of net value added by ‘other 
commerce and transport’ because no data were available. The 
earnings of the workers in this sector were estimated on a very 
rough and ready basis. In the final report many changes were 
introduced in estimating the income from this sector. The total 
number of persons engaged in trade was taken from the revised 
estimates of the working force, based on the population census of 
1951. They were split up into three categories—employers, 
independent workers and employees. Figures of the average 
earnings of employers, independent workers and employees were 
collected from a large number of sources. 


Income from professions and liberal arts—The estimation of 
income of this sector presented considerable difficulty in both the 
first and final estimates. In the first estimates, estimated average 
earnings were taken into account. The figure thus arrived at 
was compared with the estimated value of net output of this 
sector. In the final report also more or less this very method 
was followed, but the occupational distribution figures of the 1951 
population census were used wherever possible. 


Net value of Government Services—The net value contributed 
by Government services is of two types namely from general 
administration and from business enterprises. The incomes from 
Government enterprises were treated like incomes of industrial 
sector both in the first and final report. As regards net output of 
Government administration, it was taken as equal to wages and 
salaries paid by all Government departments. Such data were 
available in the Central and State budgets and from the accounts 
of Municipalities and District Boards etc. 


Income from house property—All house property in the 
country was divided into two classes—urban and rural. For urban 
property the rental income was estimated on the basis of municipal 
rates and house tax and in case of rural house property the number 
of houses was extrapolated from 1941 census tables and their 
average value was estimated on the basis of very scanty data. 
Six per cent of the value thus arrived at was taken as their annual 
rental income. 


Net income from abroad—There are two sets of data available 
in this regard: one in the D. G. C. I. S.'s Annual Accounts 
relating to the Sea-Borne Trade and Navigation in India and 
the second in the Reserve Bank of India's statistics of Balance of Payments, 
published in the R. B. I.'s Bulletin and Report on Currency and 
Finance. 


Estimates of the First and the Final Reports—The 1948-49 
estimates were published originally in the First Report. Later, 
they were brought out in the Final Report also. The following 
are the estimates :— 




National Income by Industrial Origin for 1948-49 as given 
in the First Report and in the Final Report, in 
Rs. abja (abja = 100 crores) 

                                                          net output
                                                   First Report  Final Report
                    (1)                                 (2)           (3)

agriculture
 1. agriculture, animal husbandry and ancillary
    activities                                         40.7          41.6
 2. forestry                                            0.6           0.6
 3. fishery                                             0.2           0.3
 4. total of agriculture                               41.5          42.5
mining, manufacturing and hand-trades
 5. mining                                              0.6           0.6
 6. factory establishments                              5.8           5.5
 7. small enterprises                                   8.6           8.7
 8. total of mining, manufacturing and hand-trades     15.0          14.8
commerce, transport and communication
 9. communication (post, telegraph and telephone)       0.3           0.3
10. railways                                            2.0           1.7
11. organised banking and insurance                     0.5           0.5
12. other commerce and transport                       14.2          13.5
13. total of commerce, transport and communications    17.0          16.0
other services
14. professions and liberal arts                        3.2           4.3
15. government services (administration)                4.6           4.0
16. domestic service                                    1.5           1.2
17. house property                                      4.5           3.9
18. total of other services                            13.8          13.4
19. net domestic product at factor cost                87.3          86.7
20. net earned income from abroad                      -0.2          -0.2
21. net national output at factor cost = national
    income                                             87.1          86.5
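
The internal consistency of the table can be verified by adding the rows of each group and comparing the result with the stated sub-total. In the sketch below the Final Report figure for factory establishments is taken as 5.5, the value implied by the group total (14.8 less 0.6 and 8.7); the tolerance of 0.051 is an assumption made to allow for rounding to one decimal place.

```python
# (First Report, Final Report) in Rs. abja; each group maps to (rows, stated total).
groups = {
    "agriculture": ([(40.7, 41.6), (0.6, 0.6), (0.2, 0.3)], (41.5, 42.5)),
    "mining, manufacturing and hand-trades":
        ([(0.6, 0.6), (5.8, 5.5), (8.6, 8.7)], (15.0, 14.8)),
    "commerce, transport and communication":
        ([(0.3, 0.3), (2.0, 1.7), (0.5, 0.5), (14.2, 13.5)], (17.0, 16.0)),
    "other services":
        ([(3.2, 4.3), (4.6, 4.0), (1.5, 1.2), (4.5, 3.9)], (13.8, 13.4)),
}

def close(a, b, tol=0.051):
    """True when two figures agree within rounding to one decimal place."""
    return abs(a - b) < tol

checks = {}
domestic = [0.0, 0.0]
for name, (rows, stated) in groups.items():
    sums = [sum(row[i] for row in rows) for i in (0, 1)]
    checks[name] = all(close(sums[i], stated[i]) for i in (0, 1))
    domestic = [domestic[i] + stated[i] for i in (0, 1)]

net_from_abroad = (-0.2, -0.2)
national = [round(domestic[i] + net_from_abroad[i], 1) for i in (0, 1)]
print(checks, [round(d, 1) for d in domestic], national)
```

Every group total agrees with its rows, and the four group totals plus net income from abroad reproduce the national income of Rs. 87.1 abja in the First Report and Rs. 86.5 abja in the Final Report.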


Difficulties of National Income Estimation in India. In 
India the estimation of national income presents two sets of 
problems—conceptual and statistical. 


(a) Conceptual problems—These problems arise from the 
typical under-developed nature of the Indian economy. They 
are :— 

(i) The presence of a large unorganised and non-monetised 
sector creates many problems. When estimating the value of 
output, one normally proceeds on the assumption that the bulk of the 
commodities and services produced in the country are exchanged 
for money. In our country, a considerable portion of output does 
not come into the market at all, being either consumed by the 
producers themselves or bartered for other commodities and 
services. The proportion of output which does not come under 
the influence of money differs from one sector to the other and is 
perhaps greatest in agriculture, the most important sector of our 
economy. This presents the problem of imputation of value to the 
portion not exchanged for money. This increases the element of 
guesswork in our estimates. The National Income Committee 
suggested that a classification into ‘monetary’ and ‘non-monetary’ 
sectors might be included in the national income estimates of India. 
It has not been possible so far to implement this suggestion. 


(ii) The other difficulty arises from the fact that a good many 
of Indian producers are not sure of the quantity and the value of 
their products. In western countries economic statistics are largely 
collected directly from the individuals and enterprises who are the 
active economic agents. This approach is not applicable to Indian conditions.
Besides the huge resources in money and personnel it would require,
the illiteracy of the majority of the population, the semi-subsistence
character of their economic activity and the general absence of
account-keeping among both producers and consumers make
it impossible in India. The producers do not have any clear idea
of the cost, and hence of the net value, of their product. Therefore,
an element of guesswork inevitably enters into the assessment of
output, especially in the large sectors of the economy which are
dominated by small producers.


(iii) The comparative lack of distinct differentiation of
economic functioning constitutes a formidable obstacle to national
income estimation in India. In advanced countries the development
of the economy has resulted in a clear classification of groups
into distinct categories. This facilitates the work of getting the
information required for estimating national income. In India
too such differentiation has taken place, but its scope is very
limited, and a major segment of the economy consists of household
enterprises which simultaneously and without differentiation perform
functions that would normally fall under different industrial
categories. Thus a major section of agricultural producers
pursue other occupations in other industries (large or small), often
in urban places or at any rate outside their domicile. This
overlapping creates difficulties in the estimation of national income.


(b) Statistical problems. There is also the problem of
non-availability or poor reliability of statistical data for the estimation
of national income and related accounts. The problems in this
connection are:


(i) In the field of agriculture, including livestock, forestry etc.,
data on the gross volume of production of some items are not available.
A good element of guesswork enters into these estimates. The
data regarding prices and expenses of agriculture and related
activities are quite incomplete.


(ii) In the case of factory establishments, data on output, costs
etc. are available only for the big units. The small units, which
are more numerous than the big units, fail to supply any data at all.
Thus for a big part of the industrial output, we have to depend
upon guesswork. A large number of units also fail to supply data
regarding payments to employees.


(iii) In the case of banking, data are available for the organised
sector, but the unorganised sector consisting of money lenders and
indigenous bankers fails to supply any data. The unorganised
sectors of transport and trade also present the same difficulty.


(iv) Information on government activities, while detailed, is
quite diverse and is not easily reducible to economic categories.


(v) Regional diversities in India are large and therefore the
inadequacy of data cannot easily be overcome by extending data
for one part of the country to the rest of the country.


(vi) Practically no reliable information is available for the 
services sector of the economy. 


(vii) Besides these lacunae, which directly affect the reliability
and accuracy of national income estimates, we lack other important
data required either for verifying the estimates or for interpreting
and analysing them for policy purposes. There are hardly any
current data on the basic industry of the country, that is, agriculture;
on the structure of costs; on the consumer expenditure of the population
attached to land; or on their savings, if any. There are no
data on the consumption expenditure or savings of the urban population
either. Nor are there any useful data on the distribution of income.

Mr. A. Rudra¹ has divided the statistical material used in the
estimation of national income in India into three categories: informative
statistics, derived statistics and baseless statistics.
Informative statistics are based on direct data collection; derived
statistics are based on other statistics; and baseless statistics are those
which have no objective basis at all. The majority of statistics
used in national income estimates belong to the categories of
derived and baseless statistics. The reliability of such statistics is
very low; therefore the estimates cannot be reliable. Dr. K. N. Raj*
also criticised the reliability of official national income estimates in
the following words: "In the light of all these limitations of data,
one might wonder whether anything at all can be said with confidence,
and with reasonable precision, about the growth of
national income in the last decade and the relative rates of growth
of different sectors. If one chose to be fastidious and critical, it
would indeed not be difficult to agree that the official national
income data from which many conclusions are often unsuspectingly
drawn, are not worth the paper on which they are printed."


Besides the low level of reliability of the national income estimates of
India, there are certain serious inadequacies also. Estimates of the distribution
of national income by factor shares, by economic groups, by size,
by non-monetised sectors, by type of use (personal and government
consumption, capital formation, stocks etc.) and by regions, i.e.,
states, are yet to be undertaken. Some useful work has been started
in these directions in recent years, yet much remains to be done.

1 Economic Weekly (Annual 1961) pp. 209-13.
* Some Features of the Economic Growth of the Last Decade in India,
pp. 209-13.


Suggestions for improvement. The National Income
Committee in its final report gave some useful suggestions for the
improvement of national income statistics. Some of these are
given below in brief:

1. In regard to the improvement of area statistics, immediate steps
should be taken, firstly, to survey and map the unsurveyed areas and,
secondly, to establish a reporting agency in the non-reporting areas
and in areas where no satisfactory reporting agency exists. The
reporting of area statistics should ordinarily be done by the primary
agency responsible for the administration of land revenue, for there
is no other agency which is either more extensive or which reaches
the village level with an equally full coverage. The surveying and
mapping of the unsurveyed areas might take a long time. As an
interim measure, it was suggested that in areas which are
either unsurveyed or where no satisfactory agency exists to report
the area statistics, the statistics might be obtained on a sample basis
with an appropriate sampling procedure.


2—In view of the burden of multifarious administrative 
duties imposed on the present reporting agency, the burden of 
statistical reporting should be reduced in a rational way. 

3. The present reporting of area statistics is not done everywhere
on a uniform basis. The method of accounting for mixed
crops shows a considerable variation of practice, as does the way
areas under the bunds and boundaries of fields are accounted for. Areas
under a number of vegetable and fruit crops are also not available
on a satisfactory basis. It would be desirable to show them
separately for each vegetable and fruit crop where such crops are
important.


4. As regards estimation of yield, the traditional method should
be replaced by crop-cutting experiments.


5—The aim of the yield estimation should be to provide 
estimates of agricultural production at least down to the district 
level within a reasonable margin of error. 


6. The statistics of agricultural prices are in a chaotic state.
There are any number of price series, all haphazardly collected, and
there are divergences in the prices collected by different agencies
for the same product within the same area. For agricultural
produce the best approach would be to classify markets rather than
prices, i.e., instead of talking in terms of producers' prices, traders'
prices, harvest prices etc. it would be best to talk of prices at
certain types of markets.


7. To be useful, price statistics, as different from many other
statistics, need more frequent collection and more prompt
publication. Apart from their use in national income computation,
price statistics, when properly collected and promptly examined,
are also likely to be useful guides for day-to-day policy in various
fields.


8—Data regarding cost of cultivation are extremely meagre 
in India. It would be advantageous to obtain such data from 
farm management or cost of cultivation studies undertaken on a 
small but intensive scale by academic bodies and research 
institutions.

9. The existing information on livestock numbers and products
is very unsatisfactory. For improvement of data relating to the
livestock sector, we have to improve both the quality of information
on numbers and yield rates, and the periodicity of the availability
of information. It was recommended that the present quinquennial
census of livestock should be converted into an annual partial census
with full coverage to be attained in five years. Estimates of yields
of products like milk, wool and eggs could be made on the basis of
the relevant livestock numbers, if satisfactory yield rates per animal
type are established. Similarly, estimates of products like bones,
hides and blood could be made on the basis of rates of such products
per animal slaughtered or dead. It would be appropriate to
obtain information regarding these yield and other rates from
intensive small-scale periodic studies into livestock economics by
academic bodies.

10—Regarding forest area, the available forest department 
figures are not very unsatisfactory. There is, however, great delay 
in release of the figures and an effort should be made to avoid this. 
A few studies on forest exploitation, if undertaken, will lead to an 
improvement. 

11—For estimating the products of fisheries, the committee 
suggested that a periodic census of persons, boats and nets engaged 
in fisheries coupled with intensive studies of their operational 
economies should be conducted. Secondly data relating to the 
processing and marketing of fish and fish products should be 
collected. 

12—The scope of Census of Manufactures should be extended 
by including all the industries and by re-defining the ‘factory’. 

13— The available data relating to employment in factories 
should be improved and brought out on a regular and continuous 
basis. 


14. The official annual estimate of employment in mining
industries is considerably inaccurate; hence efforts should be made
to correct it.


15. The whole field of small-scale industry is inadequately
covered or almost not covered at all. In view of the importance of
this sector in the national economy, it is necessary that systematic
attempts be made to collect information relating to it. The State
Bureaus should pay particular attention to this field, especially in
relation to employment, earnings, cost structure etc.

16. The trading sector provides very scanty or no information.
Data relating to the number of persons engaged and their earnings
should be collected. As regards trade margins and distributive
costs, structural studies should be undertaken.


17—As regards transport, data regarding employment and 
income are available only in regard to railways. Similar data are 
not available even in regard to road transport. Hence it was 
recommended that (i) an annual report should be prepared by the 
Central Ministry of Transport bringing together all available 
statistical data regarding State transport undertakings and (ii) 
structural studies should be undertaken in respect of all forms of 
urban and rural transport. 


18. Wage and salary statistics of various classes of income
earners should be brought together. These should include all
constituents of employee compensation such as provident fund,
pension and social security contributions. Surveys should be made
to ascertain payments made to farm labour, domestic servants etc.


19. Regarding incomes of the self-employed, there are various
difficulties. An attempt may be made to compile statistics in this
regard from every possible source.


20—Special studies should be undertaken to ascertain data 
on personal incomes and rentals of houses and property. 


21— The collection of income-tax data needs to be improved. 


22. It is suggested that there should be strong co-ordination
between the activities of the NIU and the NSS. The latter should
undertake collection of consumer expenditure data and data
relating to capital formation in rural areas. The NSS should be
strengthened by appointing the Research Programme Committee.
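
The method proposed in suggestion 9 above, namely multiplying livestock numbers by yield rates per animal type, may be sketched in Python as follows. This is an illustration only: the function name `estimate_output` is hypothetical, and the numbers and yield rates in the example are invented, not census figures.

```python
def estimate_output(animal_counts, yield_rates):
    """Estimate total output of each product.

    animal_counts: {animal type: number of animals}
    yield_rates:   {(animal type, product): yield per animal}
    Returns {product: estimated total output}.
    """
    totals = {}
    for (animal, product), rate in yield_rates.items():
        # Output of a product = number of animals x yield rate per animal,
        # summed over every animal type that gives the product.
        count = animal_counts.get(animal, 0)
        totals[product] = totals.get(product, 0.0) + count * rate
    return totals


# Hypothetical figures, for illustration only.
counts = {"cow": 1000, "buffalo": 500, "sheep": 200}
rates = {
    ("cow", "milk"): 2.0,
    ("buffalo", "milk"): 4.0,
    ("sheep", "wool"): 1.5,
}
print(estimate_output(counts, rates))
```

The reliability of such an estimate rests wholly on the yield rates, which is why the Committee wanted them established through intensive small-scale studies.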


Many of these suggestions have been implemented and much
improvement has taken place in the quantity and quality of
statistical information. The National Income Unit, which functions
as a branch of the C.S.O., publishes national income estimates every
year. In recent years, a lot of research work has been done
on national income. The following noteworthy developments have
taken place in this field:

1. In 1957, the C.S.O. established an "Indian Conference on
Research in National Income" with which official and non-official
experts are associated. They have discussed the following
aspects of national income:

(i) Industrial breakup of national income.

(ii) Growth of national income.

(iii) Private consumption.

(iv) Regional income.

(v) Distribution of national income according to size.

2. There has been considerable research in the field of
national income by Central and State Governments, Research
Institutes, Universities etc. The NIU has published a number of
useful papers on national income estimates.

3— The C.S.O. has conducted studies in connection with Social 
Accounting method. It has also published “Estimates of gross 
capital formation in India 1948-49 to 1960-61." 

4— The NSS has made an independent study regarding sectoral 
estimates of national income. 

5. The State Bureaus of Statistics also estimate State incomes
for their respective states.


CHAPTER 5


POPULATION CENSUS STATISTICS 


A census of population may be defined as "the simultaneous
recording of demographic data by the government at a particular
time, pertaining to all the persons who live in a particular
territory." There are two main objects of a census, one static
and the other dynamic. In the static sense, it provides a
photograph of a community and is valid for a particular point
of time. In the dynamic sense, it is one item in a consecutive
series and as such is comparable to a motion picture of the
population, giving the magnitude and direction of demographic
trends.


Chief Characteristics. According to the definition given 
above, the chief characteristics of a population census are :— 


1—Universality—A population census should be universal. 
It must include every person who is a citizen of the country. No 
person is to be left out. 

2—Simultaneous Count—This characteristic of census-taking 
is of great importance. As far as possible the entire set of figures 
collected at the time of the population census refer to a particular 
point of time. A time lag in these figures for different areas 
would render them unfit for comparison. 


3. A precisely defined territory: A population census refers
to a very clearly defined geographical or political territory. Any
changes in area in successive censuses must be clearly mentioned.

4. Regularity: If a census is not conducted at regular intervals
the data lose a lot of their value. In order to study demographic
trends, it is necessary that the census should be held regularly.
Moreover, a regular schedule of census-taking, if adopted by all
countries, would greatly facilitate international comparisons.

5. Collection of personal and individual data: In a population
census facts are collected about the individual. Each person
has to give information about himself and his family. The scope
and nature of the questions depend upon the cultural level of the
said population.

6. Government sponsorship: Since a census requires a vast
organisation and much money, it can be properly handled only by the
Government. Moreover, the Government can compel the citizens to
furnish the required information.

Utility of Population Statistics. Population statistics are 
the oldest of all statistics collected by nations. The original 
objective of population census used to be the assessment of 
military and taxable capacity of the population. Gradually, the 
scope of census has been expanding and now census has more 
social and economic utility than any other. Though primarily
meant for administrative purposes, the census also gives much
valuable information to the economist, the sociologist and the businessman.


On the basis of census figures, the economist can study the
population trend of the country, its occupational structure and
trends of urbanisation. With the help of other relevant data,
he can trace the correlation between population growth and food
supply, between occupational changes and the effect of granting
protection to industries, and between the increase in urban population
and the decay of rural crafts. The population statistics focus a lot
of attention on various socio-economic problems of the nation
and help the government in the formulation of policies. In these
days of planned economic growth and material advancement,
such statistics become all the more important, because they provide
detailed socio-economic facts about the population. As Mr. G. L.
Mehta puts it, "The tackling of the various issues involved in
the growth and shift of population, distribution of national
income, disparities in the condition of living and the problem of
poverty, unemployment etc. needs a factual background. No
sociological study or economic planning is possible without an
adequate comprehension of the occupational pattern of the
country." Such a factual background is provided by population
statistics.

The sociologist may study the possibility of effecting reforms
in respect of, for instance, the ages at which people should marry
or the arrangements that should be made to bring down infant
mortality. Various questions, such as unemployment, defence,
migration, family planning etc. can be properly studied with the
help of the census data. Social insurance schemes and rural welfare
schemes depend upon census facts.


Businessmen generally do not realise that the census reports
contain information which is very important for them. If they
did, many of their problems could very well be attended to. India
has a large volume of internal trade and every human being
recorded in the census is a customer. For a businessman,
knowledge of his consumers and their location is very helpful.
Estimates of the areas where development of markets is likely
can be made with the knowledge of the density of population of
different areas. The class for which his goods are specially meant,
say infants, ladies or military people, can be approached in a
business-like manner on the basis of the occupational statistics.
He can gauge the present and future labour supply and can make
adjustments in the production scheme accordingly.


Census reports give much useful information to a transport
agency like the railways. An area which is densely populated, or which
would become so if only the means of transport were improved or
introduced, should receive the first attention of the transport
authority. For advertising agencies, such areas would also be good
places to push their advertisements. If the population of a town falls,
demand in that area will fall unless the demand of the existing
population rises proportionately. Similarly, changes in age composition,
sex-ratio and occupational structure are likely to affect the
demand for goods. Life insurance companies may compare
their estimates of expectation of life with those published in the
census reports. Legislatures can judge, from the facts brought to
light by the census, the necessity of framing legislative provisions
for removing the ills from which society suffers. A census is thus of
great use to both the people and the State. Electoral
readjustments are also made on the basis of census reports.


Methods of Conducting Population Census. Broadly
speaking there are two methods of census-taking. They are:

(i) De facto method and (ii) De jure method.

(i) De facto method: Under the de facto method of census-taking,
all persons who are present in the territorial jurisdiction
of the country on the day or night to which the census figures
relate are counted. All persons living in the country are counted
simultaneously wherever they are found on the census night.
As enumeration has to be done simultaneously in one night or on
a particular date, it is also called the 'one night enumeration method'
or simply the 'date system'.

(ii) De jure method: Under the de jure method of census-taking,
the population is counted on the basis of its habitual residence and
not where it happens to be at the time of counting. People are
counted on the basis of their normal residence. This method is
also known as the "periodical system of enumeration."
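
The difference between the two methods can be seen by counting the same set of persons both ways. The following Python sketch is illustrative only; the field names `found_at` and `usual_residence` and the four records are assumptions made for the example.

```python
from collections import Counter


def de_facto_count(persons):
    """Count persons at the place where they are found on the census night."""
    return Counter(person["found_at"] for person in persons)


def de_jure_count(persons):
    """Count persons at the place of their normal (habitual) residence."""
    return Counter(person["usual_residence"] for person in persons)


# Hypothetical records: two persons are away from home on the census night.
persons = [
    {"name": "A", "usual_residence": "Gwalior", "found_at": "Gwalior"},
    {"name": "B", "usual_residence": "Gwalior", "found_at": "Delhi"},
    {"name": "C", "usual_residence": "Delhi", "found_at": "Delhi"},
    {"name": "D", "usual_residence": "Agra", "found_at": "Delhi"},
]

print(de_facto_count(persons))
print(de_jure_count(persons))
```

The total for the whole territory is the same under both methods; only its distribution among places differs, which is why an unusual event such as a fair on the census night disturbs a de facto count.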


POPULATION CENSUS IN INDIA 


The first systematic attempt to record population statistics in
our country was made in 1872. This census of 1872 was neither
uniform nor was it extended to the whole of the country. After
1872 the next census was taken in 1881, and since then it has
become a regular decennial feature, the latest one having been
completed in March 1961. For our study of the population census
in the country, we divide the various censuses into four parts:
(i) up to the census of 1931, (ii) the census of 1941, (iii) the census
of 1951 and (iv) the census of 1961.


Census up to 1931


The 1931 census was the 7th census of the country including
that of 1872. We shall study the procedure followed up to the 1931
census under the following heads:

(i) Census Act—In our country population census was 
considered a decennial operation for which haphazard temporary 
arrangements used to be made. Two or three years before a 
census was scheduled to take place, an Act was passed. Such 
an Act empowered the Central Government to appoint a Census 
Commissioner at the top of the census organization and 
Superintendents of Census Operations in every province. The 
Act also authorised the Government to call for information from 
individuals. The Act contained a penal clause, providing for 
punishment in the event of persons not supplying the information 
or giving out wrong information and for any obstruction and 
restraint in the discharge of the duties of the census officials. 
It also provided that the information supplied by people was to
be treated as strictly confidential and would not be used against
persons on any occasion.

(ii) Staff: The population census operation staff consisted of the
Census Commissioner at the top, Provincial Census Superintendents
in charge of the various provinces, and District Census Officers in charge of
the various districts. A district was further sub-divided into charges
and each charge was under a Charge Superintendent. A charge
was usually a sub-division of a district corresponding to a
municipal area in urban towns and a tehsil in a rural area. The
Executive Officer or the Secretary of the Municipal Board and
the Tehsildar or the Naib Tehsildar of the Tehsil was usually the
Charge Superintendent. The charges were sub-divided into
circles in charge of Circle Supervisors. Circles were further
sub-divided into blocks in charge of Block Enumerators. Thus we
see that the official census organisation was of a pyramidal nature.
The entire staff needed for the job was taken from the various
ranks of government servants and worked on an
honorary basis. The organisation is still on the same
lines.


(iii) Training of staff—After the appointment of staff, 
training in the census procedure was given to them. The training 
consisted of two parts, theoretical and practical. The staff above 
the enumerators was first supplied with census manuals which gave
detailed information and instructions as regards the census
procedure and the duties of the various officials. Officials were
also given enough practice to fill in the various forms and schedules. 


(iv) Procedure for collection of data: The actual census
work began with the numbering of houses. This work was done
much before the actual census date. The concept of a house
was related to the "Chulha" (चूल्हा), and members of a family taking
food from a common cooking place were regarded as belonging to a
common house. After numbering the houses, the preliminary census
took place. This was done a few days before the actual day of census.
The block enumerators filled in the schedules which were later
on checked by the supervisors. On the actual census night, the
preliminary record was brought up to date. Names of persons who
had left their houses or had died were struck off the list and those
who had come from outside or were born were entered in the list.
Special arrangements were made for counting those who were
travelling by rail or boat or were working in forests etc. By
6 a.m. of the day after the census night the schedules were
completed by the Block Enumerators and passed on to the
Supervisors, who in turn passed them on to the Charge Superintendents
and through them to the Provincial Census Superintendents
and ultimately to the Census Commissioner for India. The
Commissioner compiled and published these figures. Later on
the Census Reports were published and then the official
organisation of the census was wound up.


Census of 1941. The census of 1941 was the 8th census of
the country. It was held in abnormal times, as the Second World
War had already begun in the year 1939. In the census operations
of 1941 certain changes were introduced. Improved technique
was adopted in the compilation and analysis of data. The
important changes that were introduced in the 1941 census may be
divided into two groups, i.e., changes in procedure and changes
in information collected.


(A) Changes in Procedure 


1. The practice of one-night census was discontinued: The
most important change in the census of 1941 was that the one-night
census was discontinued and for the first time the population was counted
on the basis of normal residence. In other words, the de facto
method was replaced by the de jure method. This change was
introduced because the one-night census method had many
complications. Firstly, there was the difficulty of selecting the
census night. It should be a moonlit night and on that night there
should not be any unusual event, like a fair, keeping many
people out of their homes. Secondly, the one-night census required
a very large number of enumerators. Thirdly, it was not possible
to verify the accuracy of the figures reported by the heads of the
families or entered by the enumerators in their schedules. In the
new method which was adopted in the 1941 census, people were
counted on the basis of their normal residence. A period of
enumeration was fixed, and if during this period a person was absent
from his normal residence, he was still counted at the place of his
normal residence. The period fixed for the 1941 census was one
week.

2. Introduction of slips: This was another important change.
The old schedules were abolished and the enumeration was
conducted directly on slips. One slip was assigned to each individual,
on which all information relating to him was noted down.
Previous to the 1941 census, data used to be copied from the schedules
to the slips and then tabulation was done. This procedure was
very cumbersome and led to duplication of work. Under the slip
system, tabulation was done from the slips directly.

3. Use of mechanical calculators: In the 1941 census tabulating
and calculating machines were used for the first time. This
made the task of analysing the data very easy and quick.


4. Random-checking: In the census of 1941 a two per cent
random sample of all the slips was taken out for verification of
census data at a later date. Every 50th slip (known as the Y sample)
was taken out and kept separately. However, the sample could not be
analysed due to the war. These slips were later on analysed by the
Indian Statistical Institute, Calcutta, when the National Income
Committee wanted to have certain information in the year 1949.

5. Extension of the house-list: Another change introduced in
the census of 1941 was the extension of the house-list. Up to the
census of 1931 the house-list was based on the list of houses. In the
census of 1941 the house-list was made more detailed and provided
all sorts of information regarding the average size of a family, the
sex ratio and distribution of persons in houses according to age
groups.

6. Complete centralisation of printing: All the printing work
was centralised at one place. It resulted in efficiency and uniformity
in printing. The experiment was a success.

7. Use of symbols: Answers to various questions were
recorded by the use of symbols. This practice was adopted with a
view to making the information precise and mechanical tabulation
easier.
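
The two per cent sample mentioned under random-checking above was obtained by taking out every 50th slip, which is strictly a systematic rather than a simple random selection. A minimal Python sketch of the procedure follows; the function name `every_kth` and the figure of 10,000 slips are assumptions made for illustration.

```python
def every_kth(slips, k=50):
    """Return every k-th slip: the k-th, the 2k-th, the 3k-th and so on."""
    # Slicing from index k - 1 with step k picks positions k, 2k, 3k, ...
    return slips[k - 1::k]


# Hypothetical stack of 10,000 slips numbered serially.
slips = list(range(1, 10001))
sample = every_kth(slips)

print(len(sample))               # number of slips taken out
print(len(sample) / len(slips))  # sampling fraction
print(sample[:3])                # first few slips selected
```

Since one slip in every fifty is drawn, the sampling fraction is 1/50, i.e., two per cent.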


(B) Changes in Information Collected 


Some. changes were also introduced in the information 
collected, They were :— 
| Ва, of population. growth :—Two new questions were 
introduced to study the rate of population growth in the country. 
These questions were (a) the number of children born to a woman 
and (b) age at the time of first child birth. 

2—Revision of occupational classification :—The occupational
classification was revised and made more scientific and realistic in
the 1941 census. But owing to difficulties created by the war this
information could not be tabulated. However, the Indian Statistical
Institute, Calcutta, used the sample slips to give a rough idea
of the occupational structure of the population.

3—For the first time, separate figures were collected about
persons who could read but not write.
POPULATION CENSUS OF 1951


The 1951 census has a special significance because it was the 
first census of free India. It extended to the whole of the country 
excluding the state of Jammu and Kashmir. Information was also 
collected on the economic characteristics of the population. Many 
changes of far-reaching nature were introduced in the technique 
of holding the census. Information was also collected about a 
large number of problems. 


Special Features of 1951 Census. Some of the important 
features of 1951 census were as follows :— 


(i) Permanent Census Act was passed :—Till the 1941 census,
an Act used to be passed just a year or two before the
commencement of each census. But the Census Act which was passed
in 1948 for the census of 1951 was made a permanent statute.


(ii) Permanent Organisation was established :—The office of
the Registrar General and Census Commissioner was created on
a permanent basis. Till the census of 1941 there was no permanent
machinery for conducting the census. The office of the Registrar
General and Census Commissioner is situated at Delhi. The
office is charged with the responsibility of holding the decennial
census and also of estimating the population for every year of the
intercensal period on the basis of statistics of births and deaths.
The office also conducts surveys for estimating fertility and
reproduction rates.

(iii) Preparation of a National Register of Citizens :—Another
innovation of the census of 1951 was the compilation of a
National Register of Citizens. Information about individuals was
recorded in the register from the census slips. The register is
divided into numerous volumes kept on a geographical basis. There
is a register for every village or ward of a town. These registers
are arranged first district-wise and then state-wise. The Registrar
General is expected to keep the register up to date by recording
births, deaths and migrations. The Register is a complete account
of the people in the country. It is regarded as a secret document
and cannot be produced in a court of law as evidence against any
person. Researchers are permitted to consult the register for
making surveys. The register has served a very useful purpose, as
it contains a large variety of information. It has served as a
framework for social and economic surveys conducted on a random
sample basis.
(iv) Distinction between house and household :—For the
first time in the 1951 census a distinction was made between ‘house’
and ‘household’. Before this, the census used to be conducted on
the basis of houses and not households. A ‘house’ was defined as a
dwelling place with a separate main entrance, while the concept
of a household was tied to the ‘Chulha’ (hearth): persons
dining from a common kitchen were regarded as members of the
same household. This distinction was very helpful for studying the
size of families and the social question of the break-up of the
joint family.

(v) Preparation of District Handbooks :—District Handbooks 
were prepared for the first time. This brought the data in old 
District Gazetteers up-to-date. 


(vi) Classification of population :—Population was classified 
on the basis of (a) dependency and (b) employment. 


(vii) Sample verification of census count was done. 


(viii) An account was prepared of every village in India, 
containing information about its area, population, number of 
literates and distribution of population in livelihood classes. 


Changes in information collected. A large variety of new
information was collected in the census of 1951. This census laid
greater emphasis on data relating to the economic characteristics of
the population. As the country had chosen the path of planned
economic development, it was necessary to collect exhaustive
information about the occupational pattern, secondary means of
livelihood, and problems of employment, dependency etc. An
attempt was made for the first time to form an idea of the
economically active population of the country. Information was
collected on the following points for the first time :—

1—The 1951 census recorded the name of the informant and
his relation to the head of the household. This question was
included to study the break-up of the joint family system.


2—Under the question relating to civil condition, information
was also collected about divorced people. Formerly the marital
status of the people was classified in only three categories, namely
married, unmarried and widowed, but the changed social structure
necessitated the creation of this fourth category.

3—Information was also collected about displaced persons.
This information was collected for estimating the extent of the
refugee problem created by the partition of the country. Such persons
were required to give information about the date of their arrival
in India and their district of origin in Pakistan.
4—Information about race, caste and tribes was not collected
in the census of 1951, because the Government did not want to
encourage sectionalism on the basis of caste and race. However,
data were collected about special groups and backward classes,
because the Constitution gave certain privileges to such classes.
5—The question of language was made more comprehensive.
6—The population was divided into the following groups on the
basis of means of livelihood :—
A-Agricultural Class :—
(i) persons doing agriculture on their own land.
(ii) persons doing agriculture on land owned by others.
(iii) agricultural labourers.
(iv) owners of land not doing agriculture.
B-Non-agricultural Class :—
(i) persons engaged in productive work other than
agriculture.
(ii) persons engaged in trade.
(iii) persons engaged in transport.
(iv) persons engaged in other occupations and services.
7—Information was also collected regarding the economic
status of the people under the following heads :—
(i) self-supporting,
(ii) non-earning dependent,
(iii) earning dependent.
Information collected in 1951 Census. The following
questions appeared in the enumeration slip of 1951 :—
1—Name and Relationship to Head of Household.
2—Nationality, Religion and Special Groups.
3—Civil Condition.
4—Age.
5—Birth Place.
6—Date of arrival of displaced persons and their district of
origin in Pakistan.
7—Mother tongue.
8—Bilingualism.
9—Dependency and Employment.
10—Principal means of livelihood.
11—Secondary means of livelihood.
12—Literacy and education.
13—Unemployment.
14—Sex.


POPULATION CENSUS OF 1961 


The census of 1961 was the second taken after independence.
The 1951 census coincided with the beginning of the first five-year
plan and the census of 1961 coincided with the beginning of the third
five-year plan. This coincidence is of great importance because
population statistics in general, and statistics relating to the
occupational distribution of the population and the employment status
of the economically active population in particular, throw light on
the changing pattern of the Indian economy. Such data also help in
assessing economic progress in terms of national income and
level of employment. The 1961 census was the first comprehensive
census of the country, as it covered the areas previously left out. As
the office of the Registrar General and Census Commissioner had
already been established on a permanent footing at the time of the
1951 census, preparations for the census began long before the actual
census operations started, and many Conferences and Committees
discussed the details of the census work.

Census Operations. The office of the Registrar General and
Census Commissioner first took the decision about the data to be
collected in the census of 1961, and the methods to be adopted for
the purpose. Concepts and definitions of the various terms which
were to be used were finalised by it. The office also chalked out
a complete schedule according to which census operations were to
be conducted. This preliminary work having been done, census staff
consisting of Enumerators, Supervisors, Charge Officers,
Sub-divisional Officers, District Census Officers and Census
Superintendents was appointed. The period of enumeration was
19 days, beginning on 10th February 1961 and ending on
28th February 1961. The reference moment of the census was
sunrise on 1st March 1961. From 1st March 1961 to 5th March 1961
the enumerators verified the information. On 27th March 1961
the Census Commissioner declared the provisional totals.
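
The 19-day enumeration period can be confirmed by simple date arithmetic. The sketch below uses only the dates given above; the variable names are illustrative.

```python
from datetime import date

enumeration_start = date(1961, 2, 10)
enumeration_end = date(1961, 2, 28)

# Both the first and the last day are counted, hence the "+ 1".
enumeration_days = (enumeration_end - enumeration_start).days + 1

# Interval between the reference date and the provisional totals.
reference_date = date(1961, 3, 1)
provisional_totals = date(1961, 3, 27)
days_to_provisional = (provisional_totals - reference_date).days

print(enumeration_days, days_to_provisional)
```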
Special Features of 1961 Census. (i) Like the censuses of
1941 and 1951, the population census of 1961 was conducted
on a de jure basis. In other words, people were counted on the
basis of their normal residence.


(ii) The distinction between house and household which was
made in the 1951 census for the first time was retained in the 1961
census with slight modifications. In the 1961 census there were three
categories, viz. (a) Building, (b) Census House and (c) Household.
The term ‘Building’ referred to an entire structure raised on the
ground. The term ‘census house’ referred to a building or part of a
building having a separate main entrance, not necessarily leading
to a road or lane. Thus one building could have a number of
census houses provided each one had a separate entrance. A
household was defined as a group of persons who ordinarily lived
together and took food from a common mess. It was not necessary
that all members of a household should be relatives. Persons
living in hotels, hostels or hospitals therefore constituted a
household. There could be a number of census houses in a building,
and a census house could have a number of households.
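
The three-level classification described above (building, census house, household) can be pictured as a nested structure. The following is only a sketch: the class names, fields and the sample building are hypothetical and are not taken from the census schedules.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Household:
    # Persons who ordinarily live together and share meals;
    # they need not be relatives.
    members: List[str]
    is_institution: bool = False


@dataclass
class CensusHouse:
    # A building, or part of one, with its own main entrance.
    households: List[Household] = field(default_factory=list)


@dataclass
class Building:
    # An entire structure raised on the ground.
    census_houses: List[CensusHouse] = field(default_factory=list)

    def household_count(self) -> int:
        return sum(len(house.households) for house in self.census_houses)

    def population(self) -> int:
        return sum(
            len(household.members)
            for house in self.census_houses
            for household in house.households
        )


# Hypothetical building: two census houses holding three households.
building = Building(census_houses=[
    CensusHouse(households=[Household(["A", "B", "C"]), Household(["D"])]),
    CensusHouse(households=[Household(["E", "F"], is_institution=True)]),
])
print(building.household_count(), building.population())
```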

(iii) In the census of 1961 information was recorded on three
forms, while in former censuses only two forms were used. These
forms were :—

(a) House List,
(b) Household Schedule,
(c) Individual Enumeration Slip.


In this census the house-list was extended considerably so as to 
include a large variety of information. Information was collected 
about the purpose for which a house was used, whether it was 
used for residence or for shop or for workshop or school or for any 
other purpose. In case a house was used as a factory or workshop, 
further information about the number of persons employed, type 
of work done, kind of power used was also collected. Details 
about the house (regarding construction of its walls, roof etc) were 
also recorded. Thus it contained very useful information. 


In a household schedule information was collected about the 
households engaged in (a) cultivation, (b) household industries or 
(c) employed as labourers in either cultivation or household 
industries or both. Such data had never been collected in the past.
They throw light on the occupational pattern of the
Indian population, specially in rural areas.
A large variety of information was collected on the individual
enumeration slip. There was one slip for each individual.


(iv) Some of the questions which were included in 1951 
census slip were dropped, for example the question relating to 
displaced persons was dropped, as it was not necessary then. 


(v) In the 1961 census greater emphasis was laid on economic
data than on mere enumeration of people. For the first time the
whole population of the country was divided into two broad
categories, ‘Working’ and ‘Not working’. The classification
adopted in the 1951 census (self-supporting, earning dependents and
non-earning dependents) was dropped altogether. Data about
principal and subsidiary means of livelihood, which were obtained
for the whole population in 1951, were obtained only for certain
classes of people in 1961, as it was thought meaningless to make
this distinction, specially for rural areas. The basis for determining
the principal and subsidiary means was also changed. Formerly
the criterion adopted was income, so that the principal
means of livelihood was taken to be the one from which the largest
share of income was derived. In 1961 the criterion was time,
and the principal means of livelihood was taken to be the one to
which a person devoted the major part of his working time.
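
The change of criterion matters because the two rules can classify the same person differently. The sketch below illustrates this with an invented person; the function name, the activities and all the figures are hypothetical.

```python
def principal_means(activities, criterion):
    """Return the activity with the largest value under `criterion`."""
    return max(activities, key=lambda name: activities[name][criterion])


# Hypothetical person: most working time goes to cultivation,
# but most income comes from trade.
person = {
    "cultivation": {"income": 300, "months": 8},
    "trade": {"income": 500, "months": 4},
}

by_income = principal_means(person, "income")  # 1951 criterion
by_time = principal_means(person, "months")    # 1961 criterion
print(by_income, by_time)
```

Under the income rule this person's principal means of livelihood is trade; under the time rule it is cultivation. This is one reason why occupational figures from the two censuses cannot be compared directly.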


Information Collected in 1961 Census. In 1961 census 
information was collected on two different types of slips viz., 
Household Slips and Individual Slips. 


Household Slips :—The following information was recorded 
in household slips— 


1—Is the household an institution—It was to be mentioned
whether the enumerated household was an institution such as a jail,
hostel, hotel, hospital, religious institution etc.

2—Name of the head of the household—The head of the 
household was supposed to be one on whom fell the chief respon- 
sibility of the maintenance of the remaining family members. 
Thus the head of the household was not necessarily the eldest 
member of the family. 


3—Does the household belong to scheduled castes or tribes.

4—Households engaged in cultivation and/or household
industries, and details of persons working in either or both
cultivation and household industries—This section of the household
slip was divided into three parts, namely (a) Cultivation, (b)
Household industry and (c) Workers at cultivation and/or household
industries. Under part (a) information was collected about :—


(A) Land under cultivation by the household—
(i) from Government or otherwise owned permanently.
(ii) from private persons for payment in money, kind
or share.
(iii) Total of items (i) and (ii).
(B) Land given to private persons for cultivation for payment
in money, kind or share.
Figures were also collected about the total area.
Under part (b) statistics about household industry were 
obtained. The term household industry referred to such units 
which had the following characteristics :— 


(i) that they produced some goods or performed ancillary 
services like oiling, cleaning or repairing etc. of goods produced ; 
(ii) that they were of a family size in whose activity mostly 
members of the household participated ; (iii) that they were not 
of the type of registered factories though they could use machines 
and power; (iv) that they were located at the residence of the 
proprietor if the household was in the urban area. If the 
household was in the rural area the household industry unit 
could be located anywhere in the village. Information was also 
collected about the number of months for which such industrial 
units functioned in a year. If any unit worked for all the 12 
months it was also noted down. 

In part (c) information was collected about the persons who 
worked either in cultivation of land or in household industry or 
in both. It was ascertained where the head of the household 
worked and how many other members of the household worked 
in these categories of work. Statistics relating to hired labourers 
were also collected. 

Individual Slips—In individual slips statistical information
was recorded separately for every individual in the country. These
slips contained the following questions :—

1—(a) Name :—The name of the person to whom the slip
related was noted down. (b) Relationship with the head of the
household.

2—Age on last birthday.
3—Marital status :—People were classified as—
(i) Never married
(ii) Married
(iii) Widowed
(iv) Separated or divorced
4—(a) Place of birth :—Information about place of birth
was collected on the following basis :—
(i) Born in village or town in which enumerated.
(ii) Born in another village or town of the district in
which enumerated.
(iii) Born in another district of the state in which
enumerated.
(iv) Born in another state of India.
(v) Born in another country.
(vi) Born on sea, air, railways or road vehicles.
(b) Whether born in village or in town.
(c) Duration of residence if born elsewhere.

5—(a) Nationality (b) Religion (c) Scheduled Castes or
Tribes.


6—Literacy and Education :—The following information was
collected under this question :—
(i) Persons who could neither read nor write or who
could read but not write. Such persons were treated
as illiterates.
(ii) Persons who could both read and write. Such persons
were treated as literates.
(iii) Standard of Education—A record was made if the
person to whom the slip related had passed any
examination. The record was of the highest
examination passed.
7—(a) Mother tongue (b) Any other language.
8—(a) Whether working or not working.
(b) Activity if not working.
9—Cultivation :—Persons engaged in cultivation of land held
from Government or private persons, whether as employer,
single worker or family worker, were recorded here.
10—Agricultural Labour :—This question was meant for
persons engaged as agricultural labourers for wages, either in cash
or in kind, such as a share of the produce.


11—(a) Household Industry or Business. 
(b) Nature of Work. 
(c) Class of worker :—Whether Employer or employee 
or single worker or family worker. 


12—(a) Other work.
(b) Industry, Business, Trade, Profession or Service.
(c) Class of worker.
(d) Name of establishment and address.


13—Sex. 
SHORTCOMINGS OF INDIAN POPULATION STATISTICS 


Indian population statistics have been severely criticised for
their shortcomings in the past. The Census Report of 1951
stated, “It used to be likened in the past to the phoenix—the
only bird of its kind which is reputed to complete its life cycle
by burning itself on the funeral pile and then rise from ashes
with renewed youth to live through another cycle.” Mr. M. W.
Yeats observed, “The system, if that word can be used here, is
in brief that every ten years one officer is appointed to conduct
a census and officers to work under him are appointed in each
province. The states take corresponding action. The appointments
are made at the minimum of time beforehand and within
one year questionnaires have to be settled, the whole country
divided into enumeration units, a hierarchy of enumeration
officers created and trained, millions of schedules or slips printed
and distributed over the face of the country, the whole process
of enumeration carried out and checked, tabulation is then carried
out in offices located in any old place that can be found, on
make-shift pigeon-holes and furniture and with temporary staff;
rushed through the press—and then, in the third year the whole
system is wound up, the officers and the office staff are disposed
and India makes haste to discard and forget as soon as possible
all the experiences so painfully brought together.” The Indian census
was described as a “comet which appeared on the Indian horizon
every ten years and after two or three years of activity passed
away unnoticed.” The shortcomings of Indian population
statistics can be studied under two heads—(i) Shortcomings
regarding data collected and (ii) General shortcomings.
Shortcomings regarding data collected


1—Age returns :—The concept of ‘age’ has not
remained the same in the various censuses. Up to the census of 1921
age was counted as the number of years completed, but this
definition was changed in 1931, and age in that census was noted
as ‘age nearest birthday’. In the 1951 census age was noted as on
the last birthday, that is, the actual number of years completed.
In the 1961 census too this concept was adopted. In many advanced
countries of the world actual age is recorded in years, months and
days and the date of birth is also noted down. Such detailed
data are more useful for studying the age structure of the
population and for the calculation of birth, death, fertility and
reproduction rates.


Age returns of Indian censuses are admittedly unsound.
This is due to the ignorance and indifference of the people.
Generally people in rural areas have no idea of their date
of birth. Persons commonly state their ages in figures ending in
‘0’ or ‘5’. In the case of unmarried girls who have reached puberty
the age returns are definitely wrong. There is a tendency on the
part of old people and recently married girls to overstate
their age. Besides, there is a taboo in India on declaring one’s
age publicly, as it is believed that the life of the person doing so
is shortened. In villages people reckon their age on the basis of
some event like a famine or some event of a historical nature, or
the number of harvests they have seen. If the harvest fails in any
year, that year is not counted in the person’s age. Ladies generally
understate their age, but this is not found in our country, as
reported in the 1951 census report: “We do not apparently share
one weakness which is prominently observed in some other
countries. Our womenfolk, speaking generally and in large
numbers, are not keen on being recorded as younger than they
are.” It is gratifying to note that the enumerators have been
instructed to take proper precautions while recording the age of
persons who are illiterate. They are to verify the ages with reference
to particular events of importance.
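
The preference for ages ending in ‘0’ or ‘5’ can be measured. One common measure, not mentioned in this chapter and introduced here only as an illustration, is Whipple's index, which compares the number of reported ages ending in 0 or 5 with the number expected if ages were evenly spread over the range 23 to 62. The age lists below are invented for the sketch.

```python
def whipple_index(ages):
    """Whipple's index over ages 23 to 62.

    100 indicates no preference for ages ending in 0 or 5;
    500 indicates that every reported age ends in 0 or 5.
    """
    in_range = [age for age in ages if 23 <= age <= 62]
    if not in_range:
        raise ValueError("no ages between 23 and 62")
    heaped = sum(1 for age in in_range if age % 5 == 0)
    return 100 * 5 * heaped / len(in_range)


# Hypothetical returns: one person at each age from 23 to 62.
evenly_spread = list(range(23, 63))

# Hypothetical returns in which every age is rounded to 0 or 5.
heaped_returns = [25, 30, 30, 35, 40, 40, 45, 50, 50, 60]

print(whipple_index(evenly_spread), whipple_index(heaped_returns))
```

The further the index rises above 100, the stronger the rounding of ages and the less the single-year age returns can be trusted.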


2—Civil condition or Marital status :—Civil condition figures
are also unreliable in our population statistics. Till the 1941 census
there was no classification for divorced or separated persons. This
was included in the 1951 census. In all censuses those people who
lived as husband and wife, irrespective of whether they had
been legally married or not, were recorded as married. Till the 1951
census prostitutes, devdasis etc. were treated as unmarried even
though they had children. In 1961 the marital status of prostitutes
was not necessarily taken as unmarried.


3—Occupations :—The classification of occupations has not been
uniform throughout. It has changed from census to census.
Hence the data are not comparable.


4—Literacy and Education :—The concept of literate and 
illiterate has been changing in various censuses. 

5—Language :—Statistics of languages spoken by people 
have always been collected but these figures are not comparable 
from census to census as the classification of languages has not 
been uniform all through. In 1901 the number of languages 
returned was 147, in 1911 the number was 220, in 1921 it was 
222, in 1931 it was 225. 

6—Religion, caste and race :—Up to the census of 1911 the
record of religion was optional and many provinces did not
collect this information. For the first time in 1921 information
about religion was collected for the whole country. But the
classification was not uniform in all provinces. The terms caste,
tribe and race have always been ill-defined and have created
confusion. In the 1951 census the question about caste, tribe and
race was dropped altogether and replaced by nationality and
special group.
General Shortcomings


7—Geographical coverage :—The statistics collected in
different censuses of our country are not strictly comparable,
because the area covered has changed constantly. This
is due to changes in the political area. Formerly Burma was
included, but it was later separated. Up to the 1941 census, princely
states were not included. In 1951 princely states were included because
of their merger with the Indian Union, but the area that went to
Pakistan was excluded. The 1951 census did not include Jammu and
Kashmir though it was an integral part of the country. The 1961
census included Jammu and Kashmir. There are certain parts
of our country which are not easily accessible. Such areas were
covered for the first time in the 1961 census.


8—Indifference of the people :—After all, the accuracy of
the census statistics depends on the replies given by the citizens.
The quality of census figures is considerably affected by the
indifferent attitude of the Indian population. The Census
Superintendent of Central India pointed out in 1931, “Indifference
arises from the outlook on life. The average man or woman in
India matures early and is short-lived. Life presses heavily on them
and fatalism overpowers them. Childhood, adolescence, middle
life and old age are well-marked stages in life and the Hindu
sociological system has laid down conduct of life and prescribed
rules for the observance of customs and practices.” Even after
independence the Indian census has failed to arouse the interest
which the importance of population statistics demands. It should
be impressed upon the people that it is of the utmost importance
to their well-being that the census returns are correctly filled in.


9—Quality of census staff :—The quality of the staff engaged
in the census work is also a very important factor from the point
of view of the accuracy of census statistics. Unfortunately the
quality of census staff in our country has not been very satisfactory.
One of the reasons for the inferior quality of census staff is that
census work in India is unpaid. The enumerators are not paid anything
for the work done by them. They are only awarded certificates
of proficiency and medals, and in some cases a nominal
honorarium is paid. This is because census work
requires a huge army of workers, and payment to them would place
a heavy burden on the Government. But we cannot ignore the
fact that in the absence of any remuneration the staff cannot be
expected to put their heart into the work. Another factor responsible
for the quality of census staff is the question of training.
Enumerators are not properly trained and there is need for better
trained personnel for the job.


A few suggestions to improve the census data—1—The
occupational classification should not be changed at every census.
It should be simplified and adapted to Indian conditions.


2—The quality of enumerators needs improvement. As far
as possible advantage should be taken of experienced persons.
A list of persons who have done the work in a census should be
maintained, and such persons should be appointed. University
students of Economics, Commerce or Sociology who offer themselves
willingly for the work may be appointed on a nominal remuneration.


3—There should be a permanent organisation for the census 
work in every state, just as there is one in the centre. 
4—There should be co-ordination between the central and state
statistical organisations and the census organisation.


5—More economic facts should be collected.


6—For the proper study of trends in population growth,
inter-census data should be collected, and data regarding fertility
etc. should be properly analysed.


7—There should be a citizens’ advisory committee like that
of the U.S.A., consisting of experts on demography, and the
questionnaire should be prepared after consultation with its members.


8—The sample verification done after the census should be 
made extensive. 


The condition has improved considerably and it is hoped 
that there will be much improvement in the quality of census 
data in the coming censuses. 


Important findings of 1961 Census 


The important findings of 1961 census are :— 


1—On 1st March 1961, the total population of the country
was 43,92,35,082, of which 22,62,93,620 were males
and 21,29,41,462 were females.


2—In this census, for the first time, Jammu and Kashmir
(excluding areas occupied by Pakistan and China) was
included. The population of the Portuguese-occupied
territories freed by India (Goa, Daman and Diu) was also
included. For these, the figures of the census conducted by
the Portuguese on 15th December 1960 were taken.


3—Of all the states of the Union, Uttar Pradesh is the most
populous, having 16.81% of the total population
of the country. Next comes Bihar, with 10.59% of the
total population of the country.


4—India has 14.6% of the total population of the world.
Its area is 2.4% of the total world area.


5—The population has grown at the rate of 2.15%
per annum since 1951, while the rate of growth
between 1941 and 1951 was 1.3% per annum. The rate of
growth is highest for Punjab.
6—The density of population in India is 384 persons per
square mile. It was 316 persons per square mile in 1951.
The highest density is that of Bihar (691), while the lowest
is that of Rajasthan (152).
7—The sex ratio for 1961 is 940 females per 1000 males,
while it was 947 in 1951.
In Orissa, Kerala and some districts of Bihar, there are
more than 1000 women per 1000 males.
8—During the last ten years, the number of educated
persons has grown at the rate of 0.8% per annum. For males
the rate is 1% and for females 0.5%. Delhi stands first in the
progress of education. Formerly it was Kerala.
9—In 1951, there were 3957 cities with a population of
5000 or more, having a total population of 6,22,76,729.
In 1961 the number of such cities is 2690 and their total
population is 7,88,35,939. This shows that 82%
of the population of the country still lives in villages and only
18% lives in cities.
10—There are 9.95 crore farmers and 3.15 crore agricultural
labourers according to the 1961 census. For the 1951 census
these figures were 7 crores and 2.80 crores respectively.
This shows that agriculture is still the main occupation
of people in India.
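
Some of the figures in the findings above can be cross-checked against one another. The sketch below uses only numbers quoted in findings 1 and 9; the helper function and variable names are illustrative assumptions.

```python
def parse_indian_number(text):
    """Convert a figure written with Indian digit grouping to an integer."""
    return int(text.replace(",", "").replace(" ", ""))


total = parse_indian_number("43,92,35,082")
males = parse_indian_number("22,62,93,620")
females = parse_indian_number("21,29,41,462")
urban = parse_indian_number("7,88,35,939")

# Finding 1: males and females should add up to the total.
totals_agree = (males + females == total)

# Finding 7: sex ratio expressed as females per 1000 males.
sex_ratio = 1000 * females / males

# Finding 9: percentage of the population living in cities.
urban_share = 100 * urban / total

print(totals_agree, round(sex_ratio, 1), round(urban_share, 1))
```

The male and female figures add up exactly to the total, the computed sex ratio is close to the 940 quoted in finding 7, and the urban share comes to roughly 18 per cent, in agreement with finding 9.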
VITAL STATISTICS

Vital statistics are very closely related to population
statistics and differ from them in the sense that while the census
gives an account of the population at a specified time, vital
statistics give an account of the movement of population over a
period of time. Vital statistics refer to the statistics of births,
deaths, marriages, divorces, sickness, disease etc. In fact all
types of statistics which have a bearing on the mortality, fertility and
reproduction of a population are included under vital statistics.
Such statistics are very useful in the scientific investigation of social
phenomena and the improvement of conditions of life. The Labour
Investigation Committee stressed the great importance of vital
statistics to the future of social security, for calculating
contributions, benefits, reserves etc., particularly in the case of
old age pensions and invalidity pensions. Vital statistics are of great
use in knowing the extent of the influence of age on capacity to work,
and the extent of compliance with labour laws
regarding the employment of children.
The condition of vital statistics in India is extremely
unsatisfactory, and the figures of births and deaths are grossly
inaccurate, misleading and unfit for statistical analysis. There is no
standard method of registration of births and deaths prevailing
throughout the country. Several Committees and Commissions have drawn
attention to the shortcomings of Indian vital statistics. The Royal
Commission on Labour observed, "Lack of appreciation of their
(vital statistics) value in public health and of training on the
part of individuals responsible for their collection lead to
continuance of grave inaccuracies in such records." Again, the
Bhore Committee in 1946 brought out the chief defects of Indian
vital statistics. The Census Report of 1951 too pointed out the
main defects and suggested that the figures of births and deaths
implied by the census should tally with those calculated otherwise.

Vital statistics are collected in India under the Births, Deaths, 
and Marriages Registration Act 1886, providing only for voluntary 
registration of births and deaths. The work of collection of these 
figures is undertaken by the municipalities in the urban areas, 
and by the village officials like Patwari, Chowkidar in the rural 
areas. The data recorded by municipalities are not very
accurate, and those for rural areas are highly unreliable. In
the absence of provisions for compulsory registration of births and
deaths, the vital statistics of India are of doubtful accuracy for
several reasons, viz. (i) the Act is not applicable throughout
India, and (ii) reporting is incomplete even in urban areas.

The importance of vital statistics is being gradually realised. 
After the independence of the country such statistics are being 
published through the Statistical Appendices to the Annual Report 
of the Director General of Health Services. Information is 
published state-wise and split into rural and urban areas. The 
main heads under which data are collected and published are 
births, deaths, infantile mortality, deaths by causes, maternity, 
death rates, vaccination statistics and sickness and mortality of 
prisoners in jails. Now this work has been transferred to the 
Registrar General. This is really a step in the right direction. 


Suggestions for Improvement of Vital Statistics


1—Registration of births and deaths should become compulsory
for everybody and an organisation should be set up
down to the village level. The system of free post cards
prevalent in some western countries can be introduced at
least in urban areas.

2—The vital statistics should be closely linked with population
statistics.

3—Sex, age of mother, order of birth in the birth records 
and sex and marital status of the deceased are essential 
for calculation of gross and net reproduction rates. Hence 
detailed information should be collected. 


4—As the characterisation of disease implies a fair degree
of technical knowledge, it is suggested that Government
or District Board physicians and surgeons should be asked
to help the primary agency in this task.


CHAPTER 6 


NATIONAL SAMPLE SURVEY 


The Bowley Robertson Committee in its report submitted to
the Government of India strongly advised the Government to
compile statistics of economic resources of the country so that
satisfactory estimates of national income may be made. The
Committee suggested that economic surveys be conducted on the
basis of random sampling method. Nothing was done to implement
this recommendation of the Committee by the then
Government. After independence the Government paid due
attention towards collection of economic facts regarding the 
country, because this was necessary in the context of economic 
planning. Prof. P. C. Mahalanobis, the director of Indian 
Statistical Institute, Calcutta, prepared a scheme of National 
Sample Survey. The scheme was approved by the Government 
and in 1950 N. S. S. organisation was set up in the department 
of economic affairs in the Ministry of Finance. This organisation 
is an important agency for the continuous collection of reliable 
statistical data on random sample basis in order to fill up the 
gaps in the available statistical information about different 
problems. It is a pioneering attempt at the application of random 
sampling for collecting information on a wide range of subjects 
covering the whole country. The activities of the N. S. S. 
organisation are :— 

(i) Collection of data on socio-economic conditions,
production of small-scale household enterprises,
consumption and agricultural statistics.

(ii) Collection of data relating to the organised industrial
sector of the country.

(iii) Supervision of the surveys conducted by States in
the agricultural sector through their own agencies, and also
giving guidance to States for analysing and co-ordinating
the results of these surveys.

In other words, there are three functions of the N. S. S.,
namely (a) to conduct socio-economic surveys, (b) to collect
industrial statistics, and (c) to give technical guidance.

The work of collecting statistical data under the N. S. S. scheme
started on 1st October, 1950 and the first round was completed
in March 1951. The N. S. S. organisation, called the Directorate
of National Sample Survey, which is now working under the
Central Statistical Organisation, works under the technical
direction of the Indian Statistical Institute, Calcutta. The
surveys conducted by the N. S. S. are arranged in a number of
successive rounds. Besides the rounds, the N. S. S. also conducts
ad hoc surveys on a large number of problems. The second
round started in April 1951 and ended in June 1951. The
N. S. S. has completed 18 rounds and is conducting the 19th round.
The reports of all the rounds are not yet published. In these
rounds a large variety of statistics relating to small-scale industry,
employment, consumer expenditure, land holdings and utilisation,
yield, livestock, domestic industries, birth and death rates, and
agricultural labour have been collected. The Indian Statistical
Institute, Calcutta designs and plans the surveys and also prepares
the schedules and instructions for use of the staff. It also
analyses and tabulates the data and prepares the report.

The First round of survey. The first round was started 
in October 1950 and was completed in March, 1951. For this 
round out of approximately 560,000 villages in the country a 
sample of 1833 villages scattered throughout the country was 
selected for investigation. Of these villages a few were situated 
in parts of the country difficult to approach either due to danger 
from wild animals or due to the difficult physical conditions. 

These 1833 villages were divided into two groups, again
scattered throughout the country. For each of these two groups
different schedules were used for enquiry.


The first group consisted of 1189 villages, for
which the schedules had been prepared by the Indian Statistical
Institute, Calcutta. For the second group, consisting
of 644 villages, the schedules had been prepared by the Gokhale
Institute of Poona. The technique of the selection of villages is
quite instructive and interesting. As has been seen earlier, the
whole country was divided into 250 geographical strata representing
conditions of various economic, social and regional orders.
In each stratum, the number of villages was so fixed that it was
divisible by 3, so that they could be divided into two groups in
a ratio of 2 : 1.

The first group of 1189 sample villages had separate schedules,
which were of the following four kinds :—

(1) Village schedules for listing all households of a sample
village and for collecting information on (i) land utilisation ; (ii)
prices of selected commodities, e.g. cereals, pulses, oils, vegetables
etc. ; and (iii) rates of daily wages of various types of skilled and
unskilled workers.

(2) Household schedules (first set) for collecting information
relating to : (i) demographic and economic conditions, e.g. age,
sex, marital status, economic and employment status ; and (ii)
holding of land and its utilisation under various categories.

(3) Household schedules (second set) for a smaller number
of households, for compiling detailed information on household
enterprises and activities relating to : (i) agriculture and animal
husbandry, and (ii) industry, crafts, trade, services and professions.

(4) Household schedules (third set) for a smaller number of
households, for compiling detailed information relating to the value
and, if possible, the quantity of consumption of : (i) food and
beverages ; (ii) fuel and light ; (iii) rent ; (iv) clothing ; and
(v) miscellaneous items.


Procedure for collection of information and difficulties 


Different material was available in different regions ; therefore 
different methods of selecting the sample-units were followed in 
different parts of the country. The sample-units were selected in
two stages as follows : 

1. Stratification. All the states were divided into 160 strata
on the basis of geographical contiguity and topographical
homogeneity. Then 4 sub-strata were formed as follows in those strata
for which population figures of individual villages were
available :—

I    Sub-stratum with population of 1 to 499
II   Sub-stratum with population of 500 to 999
III  Sub-stratum with population of 1000 to 1999
IV   Sub-stratum with population of 2000 and above.

Thus, according to availability of population figures, sub-strata
could be formed in only 32 strata. These were divided
into 128 ultimate strata. Thus, there was an aggregate of 256
strata (i.e. 128 ultimate strata and the 128 remaining strata
which were not sub-divided).

2. Selection of villages. Within each stratum villages were
selected, depending upon the nature of information available,
according to five different procedures, as follows :—

(i) Equal probability of selection of each village.

(ii) Probability in proportion to village area.

(iii) Probability in proportion to village population.

(iv) Probability in proportion to village area but separately
within each sub-stratum.

(v) Probability in proportion to village population but
separately within each sub-stratum.

3. Selection of households. The selection of households was
made in several stages as follows :—

(i) A complete enumeration of all the households was
made in the sampled village.

(ii) A random sample of 80 households was drawn for
collecting information relating to occupation.

(iii) These 80 households were classified into two groups,
(a) agricultural and (b) non-agricultural, and out of
these two groups 8 households from each were selected,
thus making a total of 16 households. General
particulars were collected from these 16 households.

(iv) From these 16 households 2 agricultural and 3
non-agricultural households were selected at random for
collecting particulars of household enterprises.

(v) Of the remaining 11 households, one agricultural and
two non-agricultural households were selected at
random for obtaining information on consumer
expenditure.
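
The stratum count and the selection stages described above can be traced in a short Python sketch. It is only an illustration under stated assumptions: the function names, the village and household records, and the use of Python's `random` module are hypothetical and are not part of the N. S. S. design. The sketch also assumes that the 80 listed households contain at least 8 of each kind.

```python
import random


def total_strata(strata, subdivided, substrata_each=4):
    """Count the strata after some of them are split into population sub-strata."""
    ultimate = subdivided * substrata_each   # e.g. 32 x 4 = 128 ultimate strata
    undivided = strata - subdivided          # e.g. 160 - 32 = 128 strata left whole
    return ultimate + undivided


def population_substratum(population):
    """Return the sub-stratum label (I to IV) for a village population."""
    if population < 1:
        raise ValueError("population must be at least 1")
    if population <= 499:
        return "I"
    if population <= 999:
        return "II"
    if population <= 1999:
        return "III"
    return "IV"


def select_village(villages, rng, size_key=None):
    """Pick one village: with equal probability if size_key is None, otherwise
    with probability proportional to the given size ('area' or 'population')."""
    if size_key is None:
        return rng.choice(villages)
    weights = [v[size_key] for v in villages]
    return rng.choices(villages, weights=weights, k=1)[0]


def select_households(households, rng):
    """Sub-sample households given as (id, kind) tuples, kind being 'agri'
    or 'non-agri'. Assumes (for this sketch only) at least 80 households and
    at least 8 of each kind among the 80 that are listed."""
    listed = rng.sample(households, 80)                      # stage (ii)
    agri = [h for h in listed if h[1] == "agri"]
    non_agri = [h for h in listed if h[1] == "non-agri"]
    gen_agri = rng.sample(agri, 8)                           # stage (iii)
    gen_non = rng.sample(non_agri, 8)
    ent_agri = rng.sample(gen_agri, 2)                       # stage (iv)
    ent_non = rng.sample(gen_non, 3)
    rest_agri = [h for h in gen_agri if h not in ent_agri]   # 6 households remain
    rest_non = [h for h in gen_non if h not in ent_non]      # 5 households remain
    consumer = rng.sample(rest_agri, 1) + rng.sample(rest_non, 2)  # stage (v)
    return {
        "general": gen_agri + gen_non,
        "enterprise": ent_agri + ent_non,
        "consumer": consumer,
    }
```

With 160 strata of which 32 are sub-divided, `total_strata(160, 32)` gives 256, which agrees with the aggregate quoted above. Passing `size_key="area"` or `size_key="population"` to `select_village` corresponds to procedures (ii) and (iii); applying it separately to the villages of each sub-stratum corresponds to procedures (iv) and (v).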

The Second round was confined to the study of consumption 
of rural households and their distribution according to consumer 
expenditure. 

In the Third Round urban areas were included. 

In the Fourth Round the design of the urban areas remained 
the same but the design of the rural areas was totally changed. 

The Fifth Round included a survey of industrial production. 

The Sixth Round had a wide scope, seeking to collect comprehensive
information on a wider variety of
economic and social items.


The Seventh Round was made more intensive.

The Eighth Round related to land holdings with particular 
reference to operational holdings. 

The Ninth Round was concerned with data regarding 
employment. 

The Tenth Round related to crop cutting experiments.

In the Eleventh Round information relating to weights and 
measures was collected. 

In the Twelfth Round another sample of 1848 was taken for 
survey. 

In the Thirteenth Round information was also collected on 
behalf of National Book Trust about the reader's preferences 
regarding size, prices and subjects of books. 

In the Fourteenth Round information was collected regarding 
Family Budgets and working conditions. 

In the Fifteenth Round detailed information was collected 
pertaining to various aspects of economic lives of the people. 

In the Sixteenth Round information was collected relating 
to ownership of land including cultivated and others, operational 
holdings and unemployment. 

The N. S. S. also conducted a number of ad hoc surveys ;
some of them are :—

1—Survey of displaced persons in West Bengal and Bombay
for the Fact Finding Committee, Ministry of
Rehabilitation.
2—A survey of the habit of newspaper reading for the Press
Commission, Ministry of Information and Broadcasting.
3—A survey of household consumption by expenditure levels,
for the Taxation Enquiry Commission, Ministry of
Finance.
4—A survey of housing conditions for the Ministry of Works,
Housing and Supply.
5—A survey of unemployment in Calcutta for the Planning
Commission.
6—Mysore population study, for the United Nations and the
Ministry of Health.
7—For the Ministry of Labour, a family budget enquiry was
conducted at 50 centres for constructing cost of living
index numbers.
8—For the C. S. O. a study of 6000 families was made at
45 centres for constructing cost of living index numbers
for the middle class.


The N. S. S. has done commendable work since its inception.
Its scope of work is fast increasing. The organisation has collected 
fairly reliable statistics on certain important items relating to 
social, economic, and demographic characteristics of people in the 
rural as well as urban areas. 




CHAPTER 7 


PRICE STATISTICS 


Changes in the price level affect all individuals in some form
or other ; therefore, in order to measure such effects, accurate
price statistics are of great importance. Theoretically the
collection of price statistics is the study of relationship between
the value of principal commodities in terms of money and of
those commodities generally whose use forms the very basis of
standard of living of the people in the country. It has, therefore,
become a very important part of the work of statistical organisations
in almost all the countries of the world to maintain regular statistics
of prices. In our country too, price statistics occupy an important
place in the statistical system. Such statistics
become all the more important in a planned economy. The price
statistics available in the country can be studied under the following
headings :

(i) WHOLESALE PRICE STATISTICS 


The office of the Economic Adviser to the Government of 
India, at the centre and the Directorates of Economics and 
Statistics and Statistical Bureaux in the States collect wholesale 
prices of various commodities. These bodies collect information 
from official sources like customs houses and the State Bank and 
non-official sources like Chambers of Commerce and private 
business houses. The States have their own agencies for collecting
statistical material, like the District Statistical Officer or Economic
Intelligence Inspectors in every district and marketing inspectors.

Wholesale Prices of Certain Staple Articles of Trade at Selected
Stations in India :—The Office of the Economic Adviser, Government
of India (Ministry of Commerce and Industry) collects
weekly prices of those commodities which are included in the
'Index Number of Wholesale Prices'. These are the commodities
which occupy an important place in the wholesale trade of the
country. These commodities are divided into 5 groups and 16
sub-groups. The monthly prices are also available in the 'Monthly
Abstract of Statistics'.


The Directorates of Economics and Statistics in certain states
collect weekly wholesale prices of certain articles prevailing in 
the chief markets of that state. Such prices are published in the 
bulletins issued by the Directorates. 


General Wholesale Price Index Numbers 


General purpose wholesale price index numbers are compiled 
by the office of the Economic Adviser to the Government of India. 
The main indices prepared by his office are :— 

A—Economic Adviser's Sensitive Index Number of Wholesale 
Prices 

This index number was started during the Second World 
War. The base period of this index was the week ending 19th 
August 1939. Twenty-three commodities divided into four groups,
namely, (i) Food and Tobacco group, (ii) Other agricultural
commodities group, (iii) Raw materials (non-agricultural) and
(iv) Manufactured articles group, were included in this index. 
This was an unweighted index number which used simple 
geometric mean in its construction. The Economic Adviser also 
prepared an index called the ‘primary commodity index’ by 
averaging the commodities included in the first three groups, and 
another ‘index of chief articles of exports’ was also prepared taking 
into account 14 items out of the list of 23 items. 


This index number was severely criticised because the number 
of items included in the series was very small. As many important 
items were left out of account it was not a representative index. 
Thus this index number was discontinued after December 1947. 


B—Economic Adviser's Index Number of Wholesale Prices


Due to the criticism of the war-time sensitive index number, the
Economic Adviser prepared in 1944 a new scheme for the compilation
of a general-purpose wholesale price index number.
According to the scheme indices were to be prepared
in five stages. The scheme was started in the year 1944 and
was completed in 1948. The indices were prepared for five major
groups of commodities separately, and by combining these five
indices an index number for all commodities was also prepared.
The details of these indices are :—

Commodities included

This index number includes 78 commodities divided into 5
major groups and 18 sub-groups. Six index numbers are published
every week, one for each group and one combined for all the
groups. In all, 230 quotations of commodities are included.
The prices taken are those charged by manufacturers or importers
or those prevailing in the wholesale markets. Prices ruling on
or about Friday are taken.


Base Year

The base period of the index is the year ended August, 1939.

Average Used

Weighted Geometric Mean is used in the compilation of the
index number.


Weightage given 


Major Group              Weight    Sub-group                  Weight

I—Food Articles            31      (a) Cereals                   59
                                   (b) Pulses                     8
                                   (c) Others                    33
                                       Total                    100

II—Industrial Raw          18      (a) Fibres                    53
   Materials                       (b) Oilseeds                  30
                                   (c) Minerals                  10
                                   (d) Others                     7
                                       Total                    100

III—Semi-Manufactures      17      (a) Leather                    8
                                   (b) Mineral oils              13
                                   (c) Vegetable oils            16
                                   (d) Cotton yarn               35
                                   (e) Metals                    18
                                   (f) Oilcakes                   5
                                   (g) Others                     5
                                       Total                    100

IV—Manufactures            30      (a) Textile products          64
                                   (b) Metal products            17
                                   (c) Other finished products   19
                                       Total                    100

V—Miscellaneous             4          Total                    100

   Total                  100


Compilation process 


The weekly quotations of various commodities are first 
converted into price relatives. Simple geometric mean of the price 
relatives of several quotations gives the commodity index. 
Weighted geometric mean of the various commodity indices within 
a sub-group gives the sub-group index and the weighted geometric 
mean of the sub-group indices gives the group index. The 
weighted geometric mean of the group indices gives the General 
Index or All Commodity Index. This is the Economic Adviser's
Index Number of Wholesale Prices.

From weekly indices monthly index number is compiled and 
from monthly indices yearly index number is compiled. 
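
The compilation chain described above (price relatives, then simple and weighted geometric means) can be illustrated with a minimal Python sketch. The function names and the sample figures in the usage note are hypothetical; only the five major-group weights (31, 18, 17, 30 and 4) are taken from the table of weights given earlier, and the sketch does not claim to reproduce the Economic Adviser's actual computing procedure.

```python
import math


def price_relative(current_price, base_price):
    """Price of a quotation expressed as a percentage of its base-period price."""
    return 100.0 * current_price / base_price


def geometric_mean(values, weights=None):
    """Weighted geometric mean; with no weights it is the simple geometric mean."""
    if weights is None:
        weights = [1.0] * len(values)
    total_weight = sum(weights)
    log_sum = sum(w * math.log(v) for v, w in zip(values, weights))
    return math.exp(log_sum / total_weight)


# Major-group weights of the index with base year ended August 1939.
OLD_GROUP_WEIGHTS = {
    "Food Articles": 31,
    "Industrial Raw Materials": 18,
    "Semi-Manufactures": 17,
    "Manufactures": 30,
    "Miscellaneous": 4,
}


def general_index(group_indices, group_weights=OLD_GROUP_WEIGHTS):
    """Weighted geometric mean of the group indices (the All Commodity Index)."""
    names = list(group_weights)
    return geometric_mean(
        [group_indices[name] for name in names],
        [group_weights[name] for name in names],
    )
```

For instance, two quotations of a commodity with price relatives of 100 and 400 give a commodity index of `geometric_mean([100, 400])`, i.e. 200, whereas their arithmetic mean would be 250; the geometric mean is less affected by a single very large relative. The same `geometric_mean` function, called with weights, serves for the sub-group and group stages.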


Criticism of the Index Number

Though the Economic Adviser's Index Number of Wholesale
Prices is supposed to be the best index number and is the most
popular index number, it is open to criticism on a
number of grounds. The main criticisms against the index number
are as follows :—

1—The weights used in the index number are very old and
out of date. Moreover the manner in which the weights have
been selected is also defective.

2—The number and the character of the quotations used in
the index number are also not representative. There are as many
as 8 quotations for shoes and only 3 for rice.

3—The grouping of the index number is also not free from
defects.

4—It should include a larger number of items so as to make
it more representative.

C—Economic Adviser's (Revised) Index Number of Wholesale
Prices

Due to a large number of defects it was felt necessary to
overhaul the index number. Its base period needed a change.
The weight system also needed revision. It was also necessary to
include more items so as to make it more representative.
Therefore, the office of the Economic Adviser took steps to revise the
index number. The revised series included 112 commodities and
555 quotations as against 78 commodities and 230 quotations
included in the former index number. The new commodities
included are :—

Maize, Barley, Ragi, Potatoes, Onions, Oranges, Milk,
Bananas, Ghee, Fish, Eggs, Meat, Sugar cane, Hemp, Foreign
cotton, Tanning materials, Lubricating oils, Aviation spirit, Diesel
Oil, Electricity, Bamboos, Aluminium, Tin, Lead, German Silver,
Handloom cloth, Hosiery Goods, Coaltar products, Medicines,
Tools, Bobbins, Leather belting, Cycles, Plywood, Tea Chests,
Pottery Goods, and Lime.

The choice of the markets has been made taking into consi- 
deration the place of commodity in the national economy and the 
representative character of the markets and the recommendations 
of the Agricultural Prices Enquiry Committee (Thapar Committee),
and the opinions of leading Chambers of Commerce. 

The base year of the revised index number is 1952-53. The
revised index number has two new groups, namely, (i) Liquor and
Tobacco and (ii) Fuel, Power, Light and Lubricants. The
miscellaneous group was discontinued.

The weights assigned to various commodities are based on
the estimates of marketed value of domestic produce and the value 
of imports inclusive of duty. As regards manufactures weights 
have been fixed in accordance with the data for gross value of 
products as obtained at the Third Census of Manufactures 1948 ; 
imports have also been taken into account. In regard to inter- 
mediate products only the portion produced for sale has been 
considered. In the case of electricity the weight is based on energy 
sold by the electricity undertakings and valued at the average all 
India rate. Petroleum data are based on consumption figures. The 
weights refer to the post-partition period 1948-49. 


The weights of major groups and sub-groups are :— 


Major Group                 Weight    Sub-group                     Weight

I—Food articles               504     (a) Cereals                     192
                                      (b) Pulses                       43
                                      (c) Fruits and Vegetables        23
                                      (d) Milk and Ghee                84
                                      (e) Edible oils                  47
                                      (f) Fish, Eggs & Meat            17
                                      (g) Sugar and Gur                48
                                      (h) Others                       50

II—Liquor and Tobacco          21

III—Fuel, Power, Light         30
    & Lubricants

IV—Industrial raw             155     (a) Fibres                       61
   materials                          (b) Oilseeds                     60
                                      (c) Minerals                      2
                                      (d) Others                       32

V—Manufactures                290
   (i) Intermediate            41
       products
   (ii) Finished              249     (a) Textiles                    147
        products                      (b) Metal products               12
                                      (c) Chemicals                    20
                                      (d) Oil cakes                     9
                                      (e) Machinery & Transport        31
                                      (f) Others                       50

This revised index uses weighted arithmetic average in place
of geometric mean.
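
As an illustration of the weighted arithmetic average, the following Python sketch combines group indices using the major-group weights of the revised series (504, 21, 30, 155 and 290, which add up to 1000). The function name and the group index values in the usage note are hypothetical and serve only to show the arithmetic.

```python
# Major-group weights of the revised series (base 1952-53), as tabulated above.
REVISED_GROUP_WEIGHTS = {
    "Food articles": 504,
    "Liquor and Tobacco": 21,
    "Fuel, Power, Light & Lubricants": 30,
    "Industrial raw materials": 155,
    "Manufactures": 290,
}


def weighted_arithmetic_index(group_indices, weights=REVISED_GROUP_WEIGHTS):
    """Weighted arithmetic mean of the group indices."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * group_indices[name] for name in weights)
    return weighted_sum / total_weight
```

If, say, the Food articles index stood at 110 and every other group at 100, the general index would be (504 × 110 + 496 × 100) / 1000 = 105.04, which shows how heavily the food group weighs on the revised series.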

The office of the Economic Adviser publishes every week the 
revised as well as the old set of index numbers. Index numbers are 
available for each major group, sub-group and for individual 
commodities. 


An Appraisal of the Revised Index Number 


The revised index number is a representative index number.
Its scope has been extended by the inclusion of a number of additional
commodities. Still, it is considered that in view of the developing
economy the base year 1952-53 has become obsolete. It should
be replaced by a more recent year, preferably 1960-61, because with the
execution of three five-year plans the economic structure of the
country has changed. The number of commodities should
also be increased from 112 to 150.


Other Index Numbers 


(a) Index Number of Wholesale Prices in India and Some
Other Principal Foreign Countries :—This index number is prepared
by the U.N. office with 1953 as the base year. It is published
in the Monthly Bulletin of Statistics issued by the U.N.O.

(b) Index Number of Wholesale Prices in Calcutta :—This 
is the oldest index number. Formerly it was compiled by the 
Director General of Commercial Intelligence and Statistics, and 
was published in the Indian Trade Journal. Now this is compiled
by the Statistical Bureau of the West Bengal Government. It is 
an index of monthly prices. Its base year is 1914, and it includes 
59 commodities. 


(ii) RETAIL PRICE STATISTICS


A large number of daily, weekly, monthly and annual papers 
and journals contain retail prices, but such data are unsuitable for 
economic analysis. Such data are not properly and scientifically 
collected. Their scope and coverage are also uncertain. The 
Directorates of Economics and Statistics of the States also publish 
retail prices of certain commodities. The Salt Commissioner 
collects retail prices of salt which are published in the Statistical 
Abstract of India. The Reserve Bank of India publishes weekly,
monthly and annually the retail prices of Gold and Silver.

As regards index numbers of retail prices, the chief index
numbers compiled in the country are :—


A—Labour Bureau Index Number of Retail Prices (Urban
Areas) :—The Labour Bureau, Ministry of Labour, Government
of India compiled and published index numbers of retail prices of
18 selected urban centres of the country. They were monthly
index numbers and no weights were given to the items included.
The items were divided into three groups, namely, (i) All articles
of food, (ii) Fuel and Lighting and (iii) Miscellaneous.


The base year was 1944. Weekly quotations were collected
for this. Now the compilation of this index number is discontinued
and in its place simple price relatives of certain selected articles of
consumption for those centres are published with the
calendar year 1949 as the base.


B—Labour Bureau Index Number of Retail Prices (Rural
Centres) :—The Labour Bureau also compiled and published index
numbers of 11 selected rural centres in different parts of the country.
This index number was also monthly and unweighted.
The items included in the index number were grouped into four
categories, namely, (i) All food articles, (ii) Fuel and Lighting,
(iii) Clothing and (iv) Miscellaneous. The base year was 1944.
These indices are also discontinued. Now the Labour Bureau
publishes the price relatives of certain selected commodities for 12
rural centres with 1949 as the base year.

C—Consumer Price Index Numbers:—The Labour Bureau 
compiles and publishes price index numbers for a large number of 
centres in the country. The Labour Bureau also publishes the 
consumer prices of certain commodities at different centres. The 
Directorates of Economics and Statistics in States also compile such 
index numbers of certain selected centres of the state. 

Suggestions for the Improvement of Price Statistics. The
retail price statistics of the country are very inadequate and are of
very poor quality. The Thapar Committee made a number of useful
suggestions regarding the collection of retail price statistics. There
is much duplication in the collection of such data, which needs
co-ordination. A few suggestions for the improvement of price
data are given below :—

1—As there are a number of varieties of a commodity in the
market it is essential to have standardisation of [...].
As in our country the 'Ag mark' scheme is becoming [...],
such varieties should be selected for inclusion in the regimen
of the index number.

2—The quality of price statistics would improve if there is
classification of markets and separate quotations are available for
various types of markets in the country. The figures of the
regulated markets should be available separately.

3—The conceptual differences about various terms should be
removed. These differences make the data collected by different
states non-comparable.

4—There should be better co-ordination between various
statistical organisations.

5—Proper attention should be paid to the selection of the
representative centres. Centres should be selected on the basis
of their importance and not on the basis of convenience of
collection of data.

6—There is scope for improvement in the method of quoting
prices and obtaining modal price quotations.

7—With the standardisation of weights and measures the
quality of data will improve.

8—The price statistics are published by different agencies in
different publications. Such data should be made available at
one place.

9—The delay in publication should be avoided at any cost.


(iii) STATISTICS OF SECURITY PRICES 


Prices of the securities of Government and commercial enterprises
are published in a number of journals and bulletins.

Indices of Security Prices :—

(a) Economic Adviser's Series :—The office of the
Economic Adviser used to compile index numbers of
security prices up to the year 1949. It covered 150
scrips and its base year was 1927-28.

(b) Old Series of the Reserve Bank of India :—In
January 1946, the Reserve Bank of India started a
weekly series of security price index numbers with 1937
as the base year. The quotations of scrips were
obtained from the published lists of the Bombay, Calcutta
and Madras stock exchanges. It included 398 scrips,
which were selected on the basis of the importance
of the concern and the activity of the scrip in the market.
The scrips were divided into three groups, namely,
(i) Government and semi-Government securities,
(ii) Fixed dividend industrial securities and (iii)
Variable dividend industrial securities. The first
group was divided into three, the second into nine
and the third into nineteen sub-groups. Two sets of
index numbers used to be prepared, one the
regional index number and the other the All India index
number.

(c) New Series of the Reserve Bank of India :—The new
series was started from August 1953 with backward
calculations from April 1953. In this series a number
of new scrips are included while old ones are dropped.
The new grouping of the securities is :—


1—Government and semi-government securities with three
sub-groups.

2—Debentures of industrial concerns with 8 sub-groups.

3—Preference shares with 13 sub-groups.

4—Variable dividend securities with 5 sub-groups and 23
smaller groups.


CHAPTER 8


INDUSTRIAL STATISTICS 


India is an agricultural country. The industrial development 
of the country was not given due importance by the foreign
rulers. In the modern concept economic development means 
industrialisation. The growth of our industries being recent, 
industrial statistics of the country have not been properly developed, 
because their need was not acutely felt. It is only recently that 
attempts have been made to improve the industrial statistics. 

In industrially advanced countries there are adequate industrial
statistics, throwing light on every aspect of the industrial economy.
Prof. Neiswenger states that it is only in an industrially developing
economy that there arises an imperative need for the collection,
compilation and analysis of various types of statistics. This shows
that the collection of statistics is of great use in industrial economies.
Generally the industrially advanced countries collect statistical data
under the heads given below :—

(i) Capital Structure :—Under this heading statistics regarding
(a) Authorised, issued and paid up capital, (b) Fixed capital,
(c) Working capital and (d) Foreign capital investments are collected.

(ii) Employment :—Statistics are collected about the number
of persons employed under different categories and the wages and
salaries paid to them. Figures of man-hours worked, industrial
disputes, absenteeism, labour turnover etc. are also collected.

(iii) Inputs :—Input and output analysis has assumed great
significance in recent years. Such statistics are collected in
advanced countries in great detail. Figures relating to both the
quantity and the value of each industrial input, like raw materials,
power, stores consumed etc. are obtained. Thus the cost of production
is ascertained.

(iv) Outputs :—Figures relating to the quantity and value of
the main product and by-products are also collected.

(v) Other data :—Data regarding details of power consumed,
potential expansion and maximum capacity of the units etc. are
also collected.

Indian Industrial Statistics. Industrial statistics available in
our country prior to independence were extremely meagre. Whatever
improvements have been made in industrial data
are of recent origin. We shall therefore study the available statistics
in two parts, one relating to the pre-independence period and
the other relating to the post-independence period.


Pre-independence period :—Industrial statistics up to the year
1947 can be studied under four headings, namely :


1—General Statistics. The general industrial statistics
include data about the number of factories, the number of persons
employed in them, and the amount of capital invested in them.
These statistics were published in the following publications :—

(i) Large Industrial Establishments in India—This publication
was formerly issued by the Department of Commercial Intelligence
and Statistics, but from the year 1946 its publication has been
entrusted to the Labour Bureau, Ministry of Labour. This is an
annual publication. The statistics given in this publication relate
to the factories which come under the Indian Factories Act. Thus
the data relate only to such establishments as employ not less
than 20 persons. The factories are divided into ten major groups,
namely :—

(i) Textiles, (ii) Engineering, (iii) Minerals and Metals,
(iv) Food, Drink and Tobacco, (v) Chemicals, Dyes, (vi) Paper
and Printing, (vii) Processes relating to wood, stone and glass,
(viii) Processes connected with skins and hides, (ix) Gins and
presses, (x) Miscellaneous. Each of these ten groups is further
sub-divided into a number of smaller groups, and the number of
factories in each of the major and minor groups is given both
districtwise and statewise. Separate figures are given for
seasonal and perennial factories. This publication also gives
information relating to the average daily number of persons employed.
It is obtained by dividing the total attendance on all working days
by the total number of working days. It also gives some
information relating to the capital invested in different factories, but
details regarding fixed capital, working capital etc. are not given.
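
The average described above is a simple ratio of total attendance to the number of working days. A minimal sketch in Python follows; the function name and the attendance figures are hypothetical and are used only to illustrate the arithmetic.

```python
def average_daily_employment(daily_attendance):
    """Average daily number of persons employed.

    daily_attendance holds one attendance count per working day, so the
    result is total attendance divided by the number of working days.
    """
    if not daily_attendance:
        raise ValueError("at least one working day is required")
    return sum(daily_attendance) / len(daily_attendance)


# Hypothetical attendance on five working days (illustrative figures only).
attendance = [118, 120, 121, 119, 122]
print(average_daily_employment(attendance))
```

Days on which the factory did not work are left out of the list altogether, since the divisor is the number of working days and not the number of calendar days.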

(ii) Statistical Abstract of India—This is an annual publication
issued by the CSO. It also contains data regarding the industrial
economy of the country.

(iii) Statistics of Factories—This is an annual publication
issued by the Labour Bureau. It contains data regarding factories,
the number of persons employed, classified into adults, children, male
and female, and labour welfare work.

(iv) Report on the working of Joint-Stock Companies—This
publication, now issued by the Department of Company Law
Administration (now merged with the Department of Economic Affairs,
Ministry of Finance), contains data regarding joint-stock companies
working in the country. Information given in this publication
relates to the state-wise distribution of companies, the number of newly
incorporated companies and of companies wound up, the
capital invested in companies in detail, and companies incorporated
in foreign countries but working in India. The publication is
issued monthly as well as annually under the same title.


2—Statistics of Output and Cost. As regards statistics of
output and cost, there were hardly any data worth the name
available in the country. Some statistics of output were available,
but there were no data concerning cost of production.
These statistics were available in :—

(i) Monthly Statistics of Cotton Spinning and Weaving
Mills—In 1926 the Cotton Industry (Statistics) Act was passed,
and the data collected under it were published in this publication.
Under this Act figures were collected about particulars of all cotton
goods manufactured, the description and weight of all yarn spun, the
amount of cotton pressed and the consumption of Indian cotton in
Indian mills.

(ii) Monthly Statistics of the Production of Certain Selected
Industries in India—This was a monthly publication issued by the
Department of Commercial Intelligence and Statistics. It contained
information about the production of jute manufactures, paper, iron
and steel, petrol, kerosene oil, cement, paints and heavy chemicals
and wheat flour mills in India. In all these cases figures were
supplied voluntarily by the factories. Besides these industries,
information was also available about the sugar and match industries,
based on the reports received under the
Sugar and Match (Excise Duty) Act 1934.

(iii) Indian Trade Journal—It is published weekly by the
Department of Commercial Intelligence and Statistics. It contains
data regarding the production of sugar, including stocks of sugar.

Besides these publications, information was also available in the
Statistical Abstract of India and the Monthly Survey of Business
Conditions in India.

3—Statistics of Power Consumed. The statistics of power
consumed were given in the Monthly Survey of Business Conditions
in India (merged with the Journal of Industry and Trade
since 1951). Up to October 1942, information was given in
detailed form under seven heads, namely, consumption for
Domestic, Commercial, Industrial, Tramways, Electric Railways,
Street lighting and Miscellaneous uses. Since November 1942,
only total figures of the energy generated and the total units sold for
consumption have been given. Up to the year 1943, such data
were compiled by the Economic Adviser's office, and since 1944
the Electric Commissioner to the Government of India has compiled
them.

4—Statistics of Small-scale and Cottage Industries. As
such industries were in a highly disorganised state, there were
practically no data concerning them. However, some data were collected
in the 1921 census and in the Indian Tariff Board Report of 1932.

Thus we see that the condition of industrial statistics in the
country before independence was highly unsatisfactory.


Post-Independence period 


After independence the people’s Government realised the
importance of industrialisation for raising the standard of living of
the masses. In 1948, the Government announced its Industrial Policy.
The Government also followed the policy of planned economic
development of the country, and it was decided to adopt a series of
five year plans in order to implement programmes of economic
development. Under such circumstances, it was but natural that
the Government would not ignore the collection of industrial data
for planning and for measuring the progress of planning in the country.

Most of the industrial statistics collected at present are collected
under the various laws enacted recently. We shall study the industrial
statistics of the post-independence period under the following
headings :—


(i) Industrial Statistics Act 1942 :—Formerly the Government
had no power to ask for information from the people. The
Industrial Statistics Act gave this power to the Government. Its
Section 3 empowered the State Governments (then called Provinces) to
collect statistics relating to any of the following matters :

(a) Any matter relating to factories,
(b) Any of the following matters so far as they relate to the
welfare of labour and conditions of labour, namely :

(I) Prices of commodities,
(II) Attendance,
(III) Living conditions including housing, water supply
and sanitation,
(IV) Indebtedness,
(V) Rents of dwelling houses,
(VI) Wages and other earnings,
(VII) Provident and other funds provided for labour,
(VIII) Benefits and amenities provided for labour,
(IX) Hours of work,
(X) Employment and unemployment,
(XI) Industrial and labour disputes.

The Act applied only to those factories which were covered
under the Indian Factories Act. It was also optional for the
Provincial Governments to collect these statistics. The information
collected under this Act was to be kept confidential.

(ii) Census of Manufacturing Industries Rules 1945 :—The
Industrial Statistics Act 1942 provided that the Provincial Governments
should frame rules in exercise of the powers given to them. The Bombay
Province framed such rules, but no other province followed
suit. In 1945, the Directorate of Industrial Statistics was set up at the
centre, and it was thought that all the Provinces should have a
uniform set of rules. Accordingly the Directorate framed the rules
and sent them for the approval of the Provincial Governments.
The Provinces were to adopt these rules. Sections 3 and 4 gave
the procedure for the collection of data. It was as follows :—

Every year before the end of December the statistical authority
was required to send a notice to the occupant of each factory
engaged in any of the industries given in Schedule I, to supply the
required information. Along with the notice, three copies of the
form on which the information was to be supplied were to be sent.
The factory was to send back two copies duly filled in, along
with two copies of its Balance Sheet and Profit and Loss Account.

The objects of the census of manufacturing industries were :—

1—To ascertain the contribution of manufacturing industries
to the national income.

2—To study the structure of each industrial unit, each industry
and manufacturing industries as a whole.

3—To analyse the factors that influence the industries of the
country.

4—To collect factual data for determining the state's
industrial policy.

Under the Census of Manufacturing Industries Rules, 1945, industries
were classified into 63 categories, out of which 29 were included
in Schedule I. The remaining 34 were left to be covered in future. These
29 industries were :

(1) Wheat Flour, (2) Rice Milling, (3) Biscuit making, 
(4) Fruits and vegetables processing, (5) Sugar, (6) Distilleries 
and breweries, (7) Starch, (8) Vegetable oils, (9) Paints and 
Varnishes, (10) Soap, (11) Tanning, (12) Cement, (13) Glass 
and Glassware, (14) Ceramics, (15) Plywood and tea chests, (16)
Paper and paperboard, (17) Matches, (18) Cotton Textiles, (19)
Woollen textiles, (20) Jute Textiles, (21) Chemicals, (22)
Aluminium, Copper and Brass, (23) Iron and Steel, (24) Bicycles,
(25) Sewing Machines, (26) Producer Gas Plants, (27) Electric 
Lamps, (28) Electric Fans, (29) General Engineering and 
Electrical Engineering. 

The Schedule contained the forms for each industry in which
information was to be supplied. The data collected related to
the following points :—

General Information :—It included general items like the name
of the factory, its location, its present address and the address of the
proprietor, managing agents etc.

Capital Structure :—Detailed information was collected about
paid up capital, productive capital and the manner in which fixed
capital had been invested. Detailed information about working capital
was also collected.

Employment and Wages :—Information was collected about
the number of persons employed, the amount of salaries and wages
paid to them, and the man-hours worked during the year.

Power Consumed :—Fuel, electricity, coal, gas, lubricating
materials and water purchased during the year ending 31st
December.

Materials Consumed :—Materials consumed during the year
in the manufacture of products.

Output :—Quantity and value of products and by-products.


On the basis of the Schedules the information about industries
was collected annually and published in the CENSUS OF
MANUFACTURING INDUSTRIES. The Census of Manufacturing
Industries Rules were adopted in 1946, hence a
statutory census could be taken from that year. However, figures
were also collected for 1944 and 1945 on a voluntary basis. From
1946 onwards the information was collected and published
annually. In 1953, the Collection of Statistics Act was passed, which
became operative with effect from 10th November, 1956. But
the new set of rules under this Act could not be framed till 1959,
so that the annual censuses for 1957 and 1958 were again
conducted on a voluntary basis. The Census of Manufacturing
Industries gave information regarding :—

1—Registered factories in existence.

2—Factories from which returns were received.

3—Fixed capital employed.

4—Working capital employed.

5—Total capital employed.

6—Number of workers employed.

7—Number of persons other than workers employed.

8—Total number of persons employed.

9—Wages paid to workers.

10—Salaries paid to persons other than workers.

11—Money value of other benefits.

12—Total salaries and wages paid.

13—Value at factory of materials etc. consumed.

14—Value of work done for factories by other concerns.

15—Depreciation.

16—Total of materials and fuels consumed and depreciation.

17—Factory value of products and by-products.

18—Value of work done for customers.

19—Total of products and by-products.

20—Value added by manufacture.
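
The last eight items of this list build on one another. The sketch below in Python shows one plausible reading, which is an assumption inferred from the order of the items and is not stated in the text: item 16 is taken as the sum of items 13, 14 and 15, item 19 as the sum of items 17 and 18, and item 20 as item 19 less item 16. The function name and all figures are hypothetical.

```python
def value_added_by_manufacture(materials_consumed, work_done_by_others,
                               depreciation, products_value,
                               work_done_item_18):
    """Value added (item 20) under the assumed reading of items 13 to 19."""
    # Item 16: assumed to be items 13 + 14 + 15.
    total_input = materials_consumed + work_done_by_others + depreciation
    # Item 19: assumed to be items 17 + 18.
    total_output = products_value + work_done_item_18
    # Item 20: assumed to be item 19 less item 16.
    return total_output - total_input


# Hypothetical figures in thousands of rupees (illustrative only).
print(value_added_by_manufacture(
    materials_consumed=600, work_done_by_others=50, depreciation=40,
    products_value=1000, work_done_item_18=30))
```

Under this reading wages and salaries (items 9 to 12) are not deducted, so value added covers both the reward of labour and the return on capital.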

Drawbacks. The information collected under the 1945
Rules was much better and more comprehensive than the data
available before the passage of the Act. However, it suffered
from a number of defects, which were :—

1—The definitions and concepts adopted in these statistics
were defective, as they were borrowed from the Indian Factories
Act and the Payment of Wages Act. They were unsuitable for
economic analysis.

2—The schedules were not flexible and no change in them was
possible without a very tedious and long legal formality.

3—Their coverage was not complete, as they covered only 29
industries out of 63.

4—The forms on which statistics were collected were not
suitable for Government owned factories.

5—The publication of these statistics was delayed, and thus much
of their importance was lost.

6—The schedules were in English and answers were also to be
given in English.

7—The schedules were framed on the lines of the forms
used in the U.K. and the U.S.A., where the producers keep detailed
accounts. Hence they were unsuitable to Indian conditions.

On the whole it can be said that it was a step in the right
direction, and the data collected were not altogether
unsatisfactory.


SAMPLE SURVEY OF MANUFACTURING 
INDUSTRIES (SSMI) 


Besides the annual census of manufacturing industries taken by the
State Governments, the Directorate of National Sample Survey
conducted a Sample Survey of Manufacturing Industries from the
year 1951. It covered all establishments registered under sections
2m (i) and 2m (ii) of the Indian Factories Act. In other words,
those concerns using power and employing 10 or more workers and
those not using power and employing 20 or more persons were
included under the SSMI. Its scope was further extended to
cover concerns licensed under the Industries (Development and
Regulation) Act. The concerns situated in the Andaman and Nicobar
Islands and those under the Ministries of Defence and Railways were
excluded from its scope. The chief items of the questionnaire of the
SSMI were as follows :—

1—Capital Structure :—

(i) Value of fixed assets such as land and buildings,
machinery etc.
(ii) Value of working capital consisting of stocks of fuel,
raw materials, products, by-products, semi-finished
products and cash in hand etc.
(iii) Rent of fixed assets secured on lease.
(iv) Duration of working period.

2—Employment and Wages :—Employment figures with
necessary breakdowns showing wages and salaries paid to different
classes of employees.

3—Inputs :—Value and quantity of consumption of fuels, raw
materials and chemicals etc., including services received from other
units.

4—Output :—Value and quantity of products and by-products
of the factory and services rendered to customers.

The differences between the Census of Manufacturing Industries
and the SSMI were :—

1—The coverage of the SSMI was wider than that of the CMI. It
covered all industries and its geographical coverage was also better.

2—The quality of the data collected by the SSMI was also better.
Data were collected through trained investigators who visited the
sample units and got the forms filled in. That is why the National
Income Committee preferred the data collected by the SSMI to the CMI
data.

3—The data collected by the SSMI were published very late. The
SSMI figures relating to 1954 were published in 1960.

Up to the year 1958 both the CMI and the SSMI were carried out. It was
felt that there was a lot of duplication in the collection of statistics
under the two schemes, causing waste of time, money and energy.
It was decided that the annual CMI conducted by the States and
the SSMI conducted by the Directorate of NSS would be replaced
by an Annual Survey of Industries. For this, new Rules (Central)
were framed in 1959, and from 1959 onwards an Annual Survey
of Industries began to take place.


Collection of Statistics Act 1953. The Government having felt
and experienced the difficulties and shortcomings of the Collection
[...] types of statistics from various industrial and business units. This
Act came into force from 10th November 1956. The Act is
[...] co-operative societies, firms and individuals engaged in trade and
commerce) and factories (as defined under the Indian Factories Act)
and industrial concerns (public limited companies, co-operative
societies, firms and individuals engaged in manufacturing, assembly,
packing, preservation or processing of goods, or in mining, or in the
generation or distribution of electricity or any other form of power).
Under the Act the Government are empowered to collect data
relating to :—

(a) Any matter relating to any industry or class of industries.
(b) Any matter relating to any commercial or industrial
concern and in particular relating to factories.
(c) Any of the following matters so far as they relate to the
welfare of labour and conditions of labour, namely :

(i) Prices of commodities
(ii) Attendance
(iii) Living conditions including housing, water supply and
sanitation
(iv) Indebtedness
(v) Rents of dwelling houses
(vi) Wages and other earnings
(vii) Provident and other funds provided for labour
(viii) Benefits and other amenities provided for labour
(ix) Hours of work
(x) Employment and unemployment
(xi) Industrial and labour disputes
(xii) Trade unions


COLLECTION OF STATISTICS (CENTRAL) RULES 1959 


Under the Collection of Statistics Act 1953, rules were to be
framed regarding the form and manner in which the information
and returns may be furnished. But the Rules under the Act
could be passed only in 1959 and were gazetted in January 1960.
Sections 3 and 4 of these rules deal with the service of notice and
the particulars to be furnished.

According to section 3 the statistical authority shall serve upon
the owner of any factory, industrial concern or plantation a notice
requiring him to furnish :

(a) one or more returns in such manner and containing such
particulars as may be specified in the notice, (b) in case the
concern is a joint stock company, a copy of the annual balance
sheet and profit and loss account ; within the time specified in the
notice (the time limit shall not ordinarily be less than three
calendar months).

According to section 4 the owner may be required to furnish
all or any of the following particulars as indicated in the notice :—

(1) identification particulars, (2) nature of ownership and
management, (3) value of and expenditure on different components
of fixed capital, (4) value of and transactions on different
components of working capital, (5) details of employment including
number of persons employed, man-hours worked, and payments made
for different categories of employees, (6) value of privileges or
benefits accruing to different categories of employees, (7) number
and power of different kinds of prime movers separately and for
different types of motive force, (8) number and strength of motors,
(9) installed capacity, (10) details of consumption of fuel,
electricity and lubricants and their quantity and value, (11) other
materials and services consumed including raw materials, chemicals,
packing materials and stores and services purchased, (12) value and
quantity of products meant for sale, including the amount received for
work done by the factory for other concerns, (13) sales to different
types of customers, (14) stocks of fuels, materials and products,
(15) inventory of equipment other than power equipment, (16)
present age, condition and service life of buildings and (17)
any other particulars on which information may be supplied at the
discretion of the owner.

These rules are on the same lines as the CMI Rules framed
under the Industrial Statistics Act.


ANNUAL SURVEY OF INDUSTRIES


With the operation of the new rules the annual census of
manufacturing industries and the sample survey of manufacturing
industries have been replaced by an annual survey of industries. This
step avoided the duplication of work. The annual survey of industries
is conducted by the Directorate of NSS under the direction of the CSO.
Under the scheme of ASI two types of enquiries are conducted :—

(i) Census in respect of all factories employing on any day
50 or more workers with the aid of power or 100 or more workers
without the aid of power, and

(ii) Sample survey in respect of factories employing 10 to 49
workers with the aid of power or 20 to 99 workers without the aid
of power, and of industrial concerns, which happen to be selected in
the probability sample for the survey year under consideration. Under
the sample 2566 concerns are selected on a random sample basis.
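
The division of factories between the two enquiries can be written as a simple rule. The sketch below in Python takes the sample-sector ranges from the text (10 to 49 workers with power, 20 to 99 workers without power). The census thresholds of 50 or more workers with power and 100 or more without power are an assumption, read as the complement of those ranges. The function name is hypothetical.

```python
def asi_enquiry_type(workers, uses_power):
    """Return 'census', 'sample' or 'not covered' for a factory.

    Assumed thresholds: census sector from 50 workers with power or
    100 workers without power; sample sector below that, down to
    10 workers with power or 20 workers without power.
    """
    census_limit, sample_limit = (50, 10) if uses_power else (100, 20)
    if workers >= census_limit:
        return "census"
    if workers >= sample_limit:
        return "sample"
    return "not covered"


print(asi_enquiry_type(60, uses_power=True))
print(asi_enquiry_type(60, uses_power=False))
print(asi_enquiry_type(8, uses_power=True))
```

A factory falling in the sample sector is enumerated only if it happens to be drawn in the probability sample for the year, whereas every factory in the census sector is enumerated.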


Scope and Coverage :—ASI extends to all factories registered
under the Indian Factories Act and to those defined as ‘industrial
concern’ in the Collection of Statistics Act. However, the following
industries were left out of its scope :—

iron ore mining, metal mining except iron ore mining, stone
quarrying, clay and sand pits, salt mining and
quarrying, chemical and fertiliser mineral mining,
and non-metallic mining and quarrying not elsewhere
classified.

The establishments coming under the Ministries of Defence and
Railways were, as in the CMI and SSMI, left out of its scope.

Form :—Under the Annual Survey of Industries one single form of
returns has been designed to meet the requirements of both the census
survey and the sample survey.

Data Collected :—The data collected under the ASI scheme
relate to :—

Capital structure :—Details of fixed and working capital and
transactions relating to fixed capital (replacements, improvements
and expansions) during the year.

Employment and wages :—Average employment and emoluments
during the year; employment by categories etc.

Inputs :—Raw materials, chemicals, packing materials and
consumable stores consumed during the year. Work done by other
concerns for repairs and manufacturing processes. Fuel and
lubricants (excluding intermediate products consumed during the
year). Other expenses not included in the materials and fuel and
lubricants consumed.

Output :—Quantity and value of manufactured products, by-products
and intermediate products produced during the year.
Work done for other concerns on repairs and manufacturing
processes. Value of semi-finished goods including work in
process.

Installed capacity :—Installed capacity of production during
the year, its basis of estimation, spare capacity and expected
additional production.

Stocks :—Stock of raw materials, fuels, products and by-products
at the end of the accounting year.

Power Equipment :—Prime movers (steam engines, internal
combustion engines and other prime movers) as at the end of the
year. Also electric motors (AC and DC) as at the end of the year.

It is clear from the above that the information collected
under the ASI scheme was very similar to that of the CMI and
SSMI. However, for the first time data were collected on :—

(a) Equipment other than power equipment installed.

(b) Skilled, semi-skilled and unskilled workers.

(c) Installed capacity of production.

(d) Sales effected during the year, classified by the types
of consumers.

(e) Labour and management relations.

(f) Training facilities given by the factories.

(g) Industrial research.

Defects of the Annual Survey of Industries :—

1—In the ASI, concepts and definitions are the same as they
were in the CMI and SSMI. Such definitions and concepts are
related to the Indian Factories Act and the Payment of Wages Act,
and are unsuitable for economic analysis. To take an example,
the definition of ‘manufacturing process’ has been adopted from
the Factories Act. According to this definition, manufacturing
process includes work done by laundries, cinemas, dal making,
cashewnut decorticating etc., which are not really manufacturing
processes in the true industrial sense. Such confusions are also
found about the concept of wages etc.

2—The distinction between skilled, semi-skilled and unskilled
workers has to be very carefully interpreted. ‘Ex-factory
value’ has not been satisfactorily defined. Similarly the concept of
intermediate products needs proper definition.


MONTHLY STATISTICS OF OUTPUT


The Directorate of Industrial Statistics collects monthly statistics
relating to the production and installed capacity of certain selected
industries. These figures are voluntarily supplied by industrial
units. The Directorate also makes use of data collected and
furnished by the Coal Commissioner, Chief Inspector of Mines, Indian
Tea Board, Salt Commissioner, Textile Commissioner, Iron and
Steel Controller, Geological Survey of India etc. These bodies
estimate monthly production on the basis of returns supplied to
them. At present 90 industries are included, which are divided into
three categories, namely (i) Mining and Quarrying, (ii) Manufactures
and (iii) Electric light and power.


Publications containing industrial statistics :—
The following publications contain industrial statistics :

1—Annual Survey of Industries (expected to be published by
the end of 1963)

2—Census of Manufacturing Industries (from 1946 to 1958)

3—Sample Survey of Manufacturing Industries (from 1951
to 1958)

4—Monthly Abstract of Statistics (CSO)

5—Statistical Abstract of India (Annual) (CSO)

6—Journal of Industry and Trade

7—Reserve Bank of India Bulletin (Monthly)

8—Cotton and Jute Bulletins

9—Statistics of Iron and Steel Industry and Trade Control.


INDUSTRIAL FINANCE STATISTICS 


The industrial finance statistics are at present published by :—

1—Directorate of Industrial Statistics.

2—Directorate of NSS.

3—Department of Company Law Administration.

4—Office of the Controller of Capital Issues.

5—Individual Finance Corporations like I.F.C., I.C.I.C.I.
etc.

6—Reserve Bank of India.


INDEX OF INDUSTRIAL PRODUCTION 


The chief indices of industrial production prepared in the
country are :—


(a) The Capital’s Index of Industrial Activity :—Capital,
a financial weekly published from Calcutta, has been preparing the
index of industrial activity since 1938. All the activities
have been classified into six groups, namely (i) Industrial production,
(ii) Mineral production, (iii) Financial statistics (cheque
clearances), (iv) Trade, foreign and coastal, (v) Shipping, foreign
and coastal, and (vi) Rail and river borne trade. These groups
are given weights of 36, 7, 24, 20, 7 and 6 respectively. The base
year is 1935.

(b) Index of Industrial Production :—This index is published
by the CSO in the Monthly Statistics of the Production of Selected
Industries in India, with 1956 as the base year. Formerly its base
year was 1946 and it included 20 industries. Later on the base was
changed to 1951 and 37 industries were included. Now the
CSO has started a new series with 1956 as the base, covering 201
commodities.
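
When the base year of an index is changed, figures of an older series can be put on the new base by simple proportion. The sketch below in Python shows this arithmetic only; the function name and the index figures are hypothetical and are not taken from the CSO series. It also does not reproduce the change in coverage (from 20 industries to 37 industries and then to 201 commodities) that accompanied each new series.

```python
def shift_base(index_by_year, new_base_year):
    """Re-express an index series with new_base_year = 100."""
    base_value = index_by_year[new_base_year]
    return {year: round(value / base_value * 100, 1)
            for year, value in index_by_year.items()}


# Hypothetical index numbers on base 1951 = 100 (illustrative only).
old_series = {1951: 100.0, 1955: 122.0, 1956: 125.0, 1957: 130.0}
print(shift_base(old_series, 1956))
```

Each figure is divided by the old-series value for the new base year and multiplied by 100, so the relative movement from year to year is left unchanged.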

(c) Eastern Economist’s Index of Industrial Production :—
The Eastern Economist, a weekly financial journal, has published an
index of industrial production since August 1948. The year ending with
August 1939 has been taken as the base year. It includes 11 commodities
divided into three groups, which are weighted as follows :—

                                  Weight assigned
A—Textiles
    1—Cotton textiles                   40
    2—Jute textiles                     17
B—Fuel and Power                        10
C—Miscellaneous
    1—Steel                              8
    2—Iron Ore                           7
    3—Paper                              1
    4—Match                              2
    5—Paints                             1
    6—Alcohol                            1
    7—Cement                             3
    8—Sugar                             10
                                       ---
    Total                              100

In the calculation of this index the weighted geometric mean is
used.
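
A weighted geometric mean of this kind can be sketched in Python as follows. The weights are those listed in the text for the Eastern Economist's index. The production relatives are hypothetical, and the function name is introduced only for illustration; the journal's actual working is not described in the text.

```python
import math

# Weights as listed in the text for the Eastern Economist's index (total 100).
WEIGHTS = {
    "cotton textiles": 40, "jute textiles": 17, "fuel and power": 10,
    "steel": 8, "iron ore": 7, "paper": 1, "match": 2, "paints": 1,
    "alcohol": 1, "cement": 3, "sugar": 10,
}


def weighted_geometric_mean(relatives, weights):
    """Weighted geometric mean of production relatives (base period = 100)."""
    total_weight = sum(weights.values())
    # Average the logarithms of the relatives, using the weights.
    log_sum = sum(w * math.log(relatives[name]) for name, w in weights.items())
    return math.exp(log_sum / total_weight)


# Hypothetical production relatives (illustrative figures only).
relatives = {name: 100.0 for name in WEIGHTS}
relatives["cotton textiles"] = 120.0
relatives["sugar"] = 150.0
print(round(weighted_geometric_mean(relatives, WEIGHTS), 1))
```

The geometric mean is obtained by averaging the logarithms of the relatives with the given weights and taking the antilogarithm of the result, so a very large rise in one commodity pulls the index up less than it would under an arithmetic mean.
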
(d) Index Number of Industrial Profits :—The Department of
Company Law Administration prepares this index on the chain
base method. It includes eight industries, namely cotton textiles,
iron and steel, jute, cement, paper, sugar, tea and coal.
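
Under the chain base method each year is first compared with the year immediately preceding it, and these link relatives are then chained together. The sketch below in Python shows the method for a single series; the profit figures and the function name are hypothetical, and the way in which the Department combines the eight industries is not described in the text.

```python
def chain_base_index(values):
    """Return (link relatives, chain index) for a yearly series.

    The first year is taken as 100 in the chain index.
    """
    # Link relative: each year as a percentage of the preceding year.
    link_relatives = [values[i] / values[i - 1] * 100
                      for i in range(1, len(values))]
    # Chain index: previous chain index multiplied by the link relative.
    chain = [100.0]
    for link in link_relatives:
        chain.append(chain[-1] * link / 100)
    return link_relatives, chain


# Hypothetical profits for three successive years (illustrative only).
links, chain = chain_base_index([200.0, 220.0, 198.0])
print([round(x, 1) for x in links])
print([round(x, 1) for x in chain])
```

Because each year is compared only with the one before it, new companies can be brought into the comparison from year to year, which is a common reason for preferring this method.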


STATISTICS OF COTTAGE INDUSTRIES 


There are hardly any data available about the small-scale and cottage
industries of the country. In fact data should be collected for these
industries on the same pattern on which data are collected about
large-scale industries. There should be data, complete and
reliable, about inputs and output, employment, and capital invested.
Besides private researches, some unco-ordinated and limited data
are published by the All India Handloom Board, All India Khadi and
Village Industries Board, Silk Board, Coir Board, Small-scale
Industries Board etc. The Directorate of the NSS is also conducting
a bi-annual survey from 1st April 1961. This survey is being tried
in a number of centres, like Calcutta, Bombay, Bangalore, Delhi,
Kanpur and Madras. The data collected relate to general particulars,
capital, loans, power used, employment, raw materials consumed,
production, sales and stocks etc. The data are being collected on
a trial basis, and the final shape is yet to be decided.

Thus there is need for the collection, compilation and publication
of data relating to small-scale and cottage industries.


CHAPTER 9


TRADE STATISTICS 


The statistics relating to the trade of a country are of great use
because they throw light on the process of distribution of the
commodities produced. In our country trade statistics arise out of
compilations made in the course of administration of laws such as those
relating to taxation of imports and exports, and from compilations
made from returns received from railways and the sales tax
administration. These data are collected, compiled and published by the
Director of Commercial Intelligence and Statistics. Indian trade
statistics can be divided into two groups, namely, (i) Foreign Trade
Statistics, and (ii) Inland Trade Statistics.


(i) Foreign Trade Statistics 


The Director of Commercial Intelligence and Statistics,
Calcutta, till 1952 published foreign trade statistics of the country
in the following publications :—

1—Accounts Relating to the Foreign Trade (Sea and Air
borne) and Navigation of India.

2—Accounts relating to Trade of India by land with foreign
countries.

The 2nd publication contained data relating to trade by land
with Pakistan, Burma, and Iran.

In 1952, it was thought desirable to publish the entire
statistics of foreign trade in a single publication. As a result of
this both the publications were merged into one, and the name
of the new publication was ‘Accounts Relating to the Foreign
Trade (Air, Sea and Land) and Navigation of India’. In 1956,
the name of this publication was again changed to ‘Accounts
Relating to the Foreign Trade and Navigation of India’.

In the year 1957, certain fundamental changes were introduced
in the publication of foreign trade statistics. The changes were :—

1—The name of the publication containing foreign trade
statistics was changed to ‘Monthly Statistics of the
Foreign Trade of India’ Vol. I & II. The publication is
in two volumes: the first contains data relating to
exports and re-exports, and the second contains data
relating to imports.

A Supplement to this publication is also issued which contains
data regarding :—

(a) Value of foreign trade,
(b) Balance of Payment,
(c) Foreign trade indices,
(d) Foreign trade in treasure,
(e) Foreign trade with selected foreign countries,
(f) Value of imports and exports of chief commodities,
(g) Foreign trade with each country and with each
currency area.


2—Up to the year 1956, foreign trade statistics were published
on the basis of the financial year (from April to March).
From the year 1957 the calendar year (from January to
December) was adopted in order to make international
comparisons easy.

3—Another important change introduced in 1957 was in trade
classification. The trade classification in use prior to
January 1957 provided for the separate specification of
only 1717 items in the foreign trade statistics. This
classification has now been replaced by the Indian Trade
Classification, which provides for the specification of more
than 4850 articles and is based on the International Trade
Classification recommended by the Economic and Social
Council of the United Nations.

4—The figures of foreign trade given in this publication 
relate to the trade registered by customs authorities at 
Indian Sea ports, Air ports and Land Custom Stations. 
The land borne trade with Tibet, Nepal, Bhutan and 
Sikkim and trade arising in the Andaman, Nicobar, 
Laccadive, Minicoy, and Amindivi islands is included in 
the inland trade. 


5—The Monthly Statistics of the Foreign Trade of India
contains the foreign trade statistics under the following
scheme :—

A—Total foreign trade of the concerned month. Total value
and volume of imports and exports.

B—Total foreign trade up to that month in that year.

C—Foreign trade of that particular month of the last two
years, for comparisons.

6—The 4850 articles are classified into 9 sections as follows :—

(i) Food, (ii) Beverage and tobacco, (iii) Crude materials
(inedible except fuels), (iv) Minerals, Fuels and
Lubricants, (v) Animal and vegetable oils and fats, (vi)
Chemicals, (vii) Manufactured goods, (viii) Machinery
and Transport Equipment, (ix) Miscellaneous Manufactured
Articles.

Each section is divided into groups, each group is again
divided into several sub-groups, and each sub-group in its turn
is further subdivided. Thus the information is published
in sufficient detail.

Besides ‘Monthly Statistics of the Foreign Trade of India’ the
following publications also contain data regarding the foreign trade
of the country :—

1—Journal of Industry and Trade,
2—Annual Statistical Abstract,
3—Foreign Trade of India (Annual),
4—Reserve Bank of India Bulletin,
5—Customs and Excise Statement of Indian Union.

Indices of Foreign Trade. The Director-General of
Commercial Intelligence and Statistics also compiles Indices of
Foreign Trade of India. The indices are compiled keeping the
year 1958 as the base. These series relate to :—

(i) Unit value indices of imports.
(ii) Volume indices of imports.
(iii) Unit value indices of exports.
(iv) Volume indices of exports.
(v) Index of terms of trade (ratio of export price index
to import price index).

These indices are compiled on a monthly basis and annual 
indices are also computed from these figures. These indices are 
published in the Monthly Statistics of the Foreign Trade of India 
and also in the Monthly Bulletin of the Reserve Bank of India. 
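
The index of terms of trade described above, the ratio of the export price index to the import price index, can be illustrated with a short sketch. The function name and the figures are hypothetical; multiplying the ratio by 100 is the usual convention for index numbers and is assumed here.

```python
def terms_of_trade_index(export_price_index, import_price_index):
    """Ratio of export price index to import price index, expressed on base 100."""
    if import_price_index <= 0:
        raise ValueError("import price index must be positive")
    return export_price_index / import_price_index * 100.0


# Hypothetical unit value indices (base year = 100).
export_index = 100.0
import_index = 125.0
print(round(terms_of_trade_index(export_index, import_index), 1))
```

A value below 100 means that export prices have risen less (or fallen more) than import prices since the base year, so a given volume of exports buys a smaller volume of imports.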
(ii) Inland Trade Statistics


Inland trade statistics of India can be studied under the following
headings :—

(a) Coastal Trade :—The Director General of Commercial
Intelligence and Statistics compiles and publishes the coastal trade
statistics of the country in the monthly bulletin ‘Accounts Relating
to Coasting Trade and Navigation of India’. The country for
this purpose has been divided into 9 maritime blocks. They are :—

1—West Bengal
2—Orissa
3—Andhra Pradesh
4—Madras
5—Kerala
6—Mysore
7—Bombay
8—Andaman and Nicobar Islands
9—Laccadive, Minicoy and Amindivi Islands

Trade between ports in the same block is classed as ‘internal
trade’ and that between one block and another block as ‘external
trade’.
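
The distinction between internal and external coastal trade drawn above can be expressed as a simple rule. The sketch below is illustrative only; the function name and the way blocks are represented as text are assumptions.

```python
MARITIME_BLOCKS = (
    "West Bengal", "Orissa", "Andhra Pradesh", "Madras", "Kerala",
    "Mysore", "Bombay", "Andaman and Nicobar Islands",
    "Laccadive, Minicoy and Amindivi Islands",
)


def classify_coastal_trade(origin_block, destination_block):
    """Return 'internal' for trade within one block, 'external' between blocks."""
    for block in (origin_block, destination_block):
        if block not in MARITIME_BLOCKS:
            raise ValueError(f"unknown maritime block: {block}")
    return "internal" if origin_block == destination_block else "external"


print(classify_coastal_trade("Madras", "Madras"))
print(classify_coastal_trade("Madras", "Bombay"))
```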


As regards road-borne trade, there are no data available on 
this account, Though there has been enormous growth of the 


the following bulletins :— 


1—Indian Trade Journal. (Weekly)
2—Raw Cotton Statistics. (Monthly)
3—Annual Statistical Abstract of India.
4—Journal of Industry and Trade. (Monthly)
5—Review of Trade of India. (Annual)


Defects in the Internal Trade Statistics 


1—Trade within a block is not taken into account. 


2— Certain articles e.g. sale of commodities by the wholesale 
dealers are not valued properly. 


3—Value of retail trade is excluded. 


4—There are no statistics available of the road-borne trade.
This shortcoming should be removed as early as possible.


5—The river-borne trade figures exclude the trade done by
boats. The value of such trade is not negligible and
should be included.


6—The commodities covered under rail and river borne trade
are grouped into 31 classes. There is need for revising the
classification in view of the extension of trade in
recent years.

7—There are no inland trade indices as we have for foreign
trade.

8—For foreign trade, data are collected of value as well
as of volume, but for inland trade only statistics of volume
are collected. There is need for collecting value statistics
also.

9—Inland trade should be divided into public sector trade
and private sector trade so that the progress of both may be
measured.

10—The figures of trade with Tibet and Nepal should be
excluded from the inland trade, as these are
foreign countries.

Thus we have to do a lot in order to make the inland trade
statistics of India complete, accurate and reliable.


CHAPTER 10


FINANCIAL STATISTICS 


Financial Statistics in India can be divided into two
classes :—


(a) Statistics of Public Finance,
(b) Statistics of Banking and Currency.


Public Finance Statistics 


Detailed, comprehensive and up-to-date public finance
statistics are available in India. Such statistics are available in
the Central and State Government Budgets. There are two
budgets at the centre, one called the General Budget and the other
called the Railway Budget. There is a separate budget for each
of the states. Data regarding income and expenditure of local
bodies like municipalities, corporations and district boards are also
available. The public finance statistics are available in the
following publications :—


(I) The Budgets of the Central and State Governments :—
The budget of the Central Government is laid before Parliament
in the month of February. This budget refers to the income and
outlay of a financial year. Similarly budgets of the states are
presented in the respective legislatures. Several statistical
statements are placed along with the budgets.


(II) Combined Finance and Revenue Accounts of the
Central and State Governments in India (Ministry of Finance) :—
This publication contains mainly the actual cost receipts and
disbursements during the financial year. The accounts are
classified into (i) Revenue, (ii) Capital, (iii) Debt and (iv)
Remittance. Each of these classes is sub-divided into detailed
sections.


(III) Statistical Abstract (CSO) :—This publication contains
statistics on public finance as divided into General Finance,
Railway Finance, and finance of Local bodies and Municipalities.

(IV) Monthly Abstract of Statistics (CSO) :—This monthly
Abstract also contains some statistics about public finance. It
gives receipts, expenditure and debts of the Central and State
Governments. It also gives yields from important taxes.


(V) Report on Currency and Finance (R. B. I.):—This 
annual also contains data on public finance. 

(VI) Reserve Bank of India Bulletin (Monthly) :—This also
gives monthly data about public finance.


Statistics of Banking and Currency 


The following publications contain data on this sector :— 


(I) Reserve Bank of India Statement of Affairs (Weekly) :— 
This publication gives separately the assets and liabilities of the 
Banking and Issue Departments of the Reserve Bank of India. 
This information is reproduced in a number of journals.


(II) Statement of Affairs of Scheduled Banks (Weekly) :— 
This publication gives a consolidated statement about the financial
position of the scheduled banks. It gives the position of the 
banks at the close of every Friday. It contains data about the 
various aspects of scheduled banks, e.g. their demand liabilities 
in India, time liabilities in India, cash in India, their balances 
with the Reserve Bank of India, their balances with other banks 
in India, money at call and short notice in India, Investments 
in India, advances in India and inland bills purchased and 
discounted in India. 

(III) Reserve Bank of India Bulletin (Monthly) :—In this 
bulletin details about assets and liabilities of the non-scheduled 
banks are published on the same pattern as that for the scheduled 
banks. The bulletin also contains statistical information on a
number of other matters. 

(IV) Report on the Trend and Progress of Banking in India
(Annual) :—This annual publication reviews the developments in
the field of banking during the year. The publication also
discusses the impact of these developments on the national
economy. It reviews the changes in the structure and liabilities
of the commercial banks and their supervision and guidance by
the Reserve Bank. The statistical appendix gives detailed tables
regarding the liabilities and assets of the Reserve Bank, the
consolidated position of scheduled banks, liabilities and assets in
India of banking companies, money rates in India, interest rates on
deposits, analysis of investments of banks, advances of scheduled
and non-scheduled banks, classification and distribution of banking
companies, cheque clearances and velocity of circulation of deposit
money etc.


(V) Statistical Tables relating to Banks in India (Annual) :— 
This annual gives detailed data about the working of Indian and 
foreign banks in India. 

(VI) Statistical Statements relating to the Co-operative 
Movement in India (Annual) :—This publication gives details 
about the co-operative banks in India. Statistics are given 
regarding their number, membership, working capital, deposits and 
loans held, loans outstanding, cash balances etc. The banks are
classified into State Co-operative banks, Central Banks, Primary
credit societies, Land Mortgage banks and other Societies.


(VII) Report on Currency and Finance (Annual) :—This is
another important publication of the Reserve Bank of India.
It deals with the economic statistics of India and contains a
wealth of such data.


CHAPTER 11


LABOUR STATISTICS 


Labour statistics are of great use in economic analysis. For 
an industrial economy which is gradually advancing towards 
mechanisation, labour saving devices and rationalisation, the 
utility of such statistics is considerable. For measuring labour 
productivity, which is an important economic indicator, economic 
efficiency and employment output ratio, labour statistics are needed. 
Labour statistics can be studied under following headings :— 


(A) Employment Statistics 


The following are the sources of employment statistics :— 
1—Labour Bureau, Ministry of Labour, Government of India.
2—Census of Manufacture. (From 1946 to 1958)
3—Sample Survey of Manufacturing Industries. (Up to 1958)
4—Annual Survey of Industries. (From 1959)


Labour Bureau:—The following employment statistics are 
published by the Labour Bureau :— 


(i) Number of working factories and average daily 
employment. 

(ii) Number of registrations and placements effected by the 
employment exchanges and the number of employers 
using employment exchanges. 

(iii) Number of persons undergoing training in the training
centres and the number of training centres.

(iv) Statistics relating to labour absenteeism. 


(v) Statistics relating to labour turnover. 
Census of Manufacture :—Under this census statistics were collected
about employment of labour in various manufacturing industries.
These statistics related to 29 industries only, later reduced to
28. Employed persons were classified as follows :—


(A) Workers

    Adults
        Men:    Employed directly by factory
                Employed through contractors
        Women:  Employed directly by factory
                Employed through contractors
    Children
        Boys:   Employed directly by factory
                Employed through contractors
        Girls:  Employed directly by factory
                Employed through contractors

(B) Persons other than Workers


I—Total number of man-hours worked during the year.
II—Average number of persons employed per day.


Sample Survey of Manufacturing Industries :—The SSMI
collected statistics of persons employed under the following
classification :—


(i) Labour employed
(a) Directly employed
(b) Employed through contractors


(ii) Number of other employees.
Figures with regard to workers were collected
separately for men, women and children.


(iii) Average number of workers per working day.


(iv) Salaries, wages and other emoluments paid to workers 
and employees. 


(v) Individual benefits paid in kind. 
(vi) Group benefits. 
(vii) Contributions to funds. 
(viii) Change in the volume of employment over the four 
quarters of a year. 
Annual Survey of Industries :—Under the ASI statistics are
collected about :—

(i) All factories employing on any day 50 or more workers
with the aid of power and 100 or more without the
aid of power.


(ii) All factories employing 10 to 49 workers with the aid
of power and 20 to 99 without the aid of power.


(a) Labourers are classified into skilled, semi-skilled 
and unskilled categories. 


(b) Statistics are also collected about the training 
facilities given by the factories. 


(c) Average employment figures during the first week 
of each quarter of the year of workers employed 
directly or through contractors are also being 
collected. 
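
The two size classes of factories covered by the ASI, as described above, can be written as a small decision rule. This is only a sketch: the function name and the labels "(i)" and "(ii)" follow the numbering in the text, and a single worker count is assumed in place of the "on any day" condition.

```python
def asi_size_class(workers, uses_power):
    """Return '(i)', '(ii)' or None according to the size limits given in the text."""
    if uses_power:
        if workers >= 50:
            return "(i)"
        if 10 <= workers <= 49:
            return "(ii)"
    else:
        if workers >= 100:
            return "(i)"
        if 20 <= workers <= 99:
            return "(ii)"
    return None  # below the limits, so outside both classes


print(asi_size_class(60, uses_power=True))
print(asi_size_class(60, uses_power=False))
print(asi_size_class(5, uses_power=True))
```

The same number of workers can fall in different classes depending on whether power is used, which is why both facts must be recorded for every factory.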


Besides these publications, employment statistics are also 
available in :— 


1—Annual Report of the Chief Inspector of Mines in India,
which gives the number of workers employed in mines.


2—Tea in India :—The D.E.S., Ministry of Food and Agriculture,
in this publication gives the average number of persons
employed daily in each district or estate in the tea
industry of the country. Separate figures are given for
garden labour and outside labour, permanent labour and
temporary labour.

3—Monthly Abstract of Statistics (CSO) :—The abstract
gives the distribution of workers according to occupation
or means of livelihood.


4—Indian Coal Statistics :—This gives data about employment
in coal mines.

5—Census of India :—This gives employment figures in
various occupations.

6—Indian Labour Year Book :—Such data are given in this
publication also.

7—Census of Central Government Employees (CSO) :—
This census is an annual feature. It gives data about
employment in the Central Government.
(B) Wage Statistics


Wage statistics have a special importance in the context of
the economic development of a country. The statistics of employment
indicate the extent of economic distress while the wage statistics
indicate the economic status of the wage-earning class. The
object of wage statistics is to describe the earnings of the people,
to compare the rates of wages for different times and places
by occupation, sex and age and to find the relative number of
persons working at each rate. Such statistics aid in creating a
harmonious relationship between the employer and employee.

In India wage statistics are not available to the extent 
desirable, and moreover there are no such data pertaining to rural 
areas. 


Difficulties in the collection of wage statistics in India :— 


1—There is no uniformity in the periods and the methods 
of payment of wages in the different areas and in the 
different industrial establishments. In many cases labour- 
ers are paid in cash as well as in kind and are also given 
some other concessions. Statistics of wages paid in 
factories and mines are available to a large extent from 
the returns received from factories. For other urban 
occupations such statistics are not normally available. In 
cases of rural occupations, such data are not available. 


2—Irregularity of employment and seasonal employment are in
themselves a problem, which creates difficulty in keeping
records of wage data.


3—Until now in India the problem of fixing fair wage rates 
remained merely an academic question. It is necessary 
that the trained field staff for the compilation of the data 
regarding wages should be provided. 


Wage statistics of the country can be divided into two groups 
namely, (i) Industrial wage statistics, (ii) Agricultural wage 
statistics. 


Industrial wage statistics :—In India these data are very 
inadequate and unreliable. In 1873 publication of a six monthly 
bulletin was started under the title of ‘Prices and Wages’, but 
in 1905 its publication was discontinued. The Royal Commission 
on Labour criticised the existing state of affairs, and recommended 
the collection of accurate wage data in different industries at 
different centres. The Labour Investigation Committee (known
as the Rege Committee) collected statistics of wages relating to
certain industrial centres in India.

The Labour Bureau, Ministry of Labour, Government of
India, publishes data relating to per capita average annual earnings
collected under the Payment of Wages Act 1936. Under the
provisions of this Act every factory registered under the Indian 
Factories Act sends regularly labour statistics to the Labour 
Department of the state concerned. The state departments send 
these data to the Labour Bureau for processing and publication. 


The Labour Bureau publishes these statistics in its Labour 
Journal under following categories :— 


1—Earnings of persons covered by the Payment of Wages Act.
Under the Act as amended in 1958, all persons whose
earnings are below Rs. 400/- p.m. are included. Formerly
the income limit was Rs. 200/- p.m. only. The information
is given industrywise as well as statewise. The industrywise
figures are given under the following heads :—(a) Bonus,
(b) Money value of concessions, (c) Basic wage, (d)
Cash Allowances, and (e) Arrears.


2—Per capita average annual earnings collected under the
Mines Act (by the Chief Inspector of Mines).

3—Earnings of workers in plantation industries.

4—Average annual earnings of certain types of staff working
in Government Railways and Docks, and some ad hoc figures
relating to nationalised road transport.

5—Wages of working journalists.


6—Minimum wages fixed or revised under the Minimum
Wages Act of 1948.

7—Average wages of casual agricultural labour.

Up to 1958, the Census of Manufacture and the SSMI contained
wage statistics. Now the Annual Survey of Industries gives such data.
Under the ASI workers are classified as skilled, semi-skilled and
unskilled and separate figures are collected for each category of
workers.


Agricultural Wage Statistics 


The position with regard to agricultural wage statistics is far
from satisfactory. Only a few states have collected some such
data by conducting five-yearly wage surveys. In 1949 the Technical
Committee gave valuable suggestions in this connection, and they
are being implemented by the D.E.S., Ministry of Food and
Agriculture. The Committee suggested the following classification :—


A—Skilled Labour
(i) Ironsmith
(ii) Cobbler
(iii) Weavers
B—Field Labour
(i) Plough man
(ii) Sowers
(iii) Transplanters
(iv) Weeders
(v) Reapers
C—Other agricultural Labourers
(i) Coolies
(ii) Load Carriers
(iii) Well diggers
D—Herdsmen

Agricultural labour is classified into men, women and
children. They are paid in cash as well as in kind. The
remuneration given in kind is converted into its cash value. For
every district a village is selected as a sample village, and the wage
prevalent there is taken for the district as a whole. Such data
are published in
I—Agricultural Situation in India (Monthly)
II—Agricultural Wages in India (Annual)
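
The conversion of remuneration in kind into its cash value, mentioned above, amounts to valuing each quantity at a price and adding the result to the cash wage. The sketch below assumes hypothetical quantities and prices and a hypothetical function name; the text does not say which prices are used for the valuation.

```python
def total_daily_wage(cash_wage, kind_payments):
    """Cash wage plus the cash value of payments in kind.

    kind_payments is a list of (quantity, price_per_unit) pairs.
    """
    kind_value = sum(quantity * price for quantity, price in kind_payments)
    return cash_wage + kind_value


# Hypothetical example: Rs. 1.50 in cash plus 2 seers of grain valued
# at Rs. 0.40 per seer.
wage = total_daily_wage(1.50, [(2.0, 0.40)])
print(round(wage, 2))
```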
The Agricultural Labour Enquiry Committee in its three 
enquiries has collected sufficient data on this problem. The 


Central Ministry of Agriculture has also collected data regarding 
this, with the help of NSS. 


(C) Cost of Living Statistics 


Figures relating to the cost of living of factory workers are
collected by the Labour Bureau for a number of centres, and also by
the labour departments of various states. The Labour Bureau
publishes consumer price indices for the following centres in
different states :—


Assam: Gauhati, Silchar, Tinsukhia
Bihar: Jamshedpur, Jharia, Dehri-on-Sone, Monghyr
Maharashtra: Akola
Delhi: Delhi
M. P.: Jabalpur, Bhopal, Satna
Madras & Kerala: Plantation centres (four centres)
Mysore: Mercara
Orissa: Cuttack, Berhampur
Punjab: Ludhiana
Rajasthan: Ajmer, Beawar
W. B.: Kharagpur

Besides, the following are the important cost of living index
numbers published by States’ Labour Departments :—
1—Working class cost of living index number, Bombay.
2—Working class cost of living index number, Kanpur.
3—Working class consumer price index, Indore.
4—Working class consumer price index, Patiala.
5—Working class consumer price index, Patna.
6—Working class consumer price index, Calcutta.


(D) Trade Union Statistics


The Labour Bureau collects and publishes data relating to 
trade unions in the country. The scope and coverage of these 
data are however limited, because all the trade unions are not 
registered bodies. Even for those trade unions which are 
registered, figures are not comparable, because industrial classi- 
fication has not remained unchanged. The statistics relate to the 
following points :— 


(i) Number of registered trade unions and membership 
of the Unions submitting returns. Membership figures 
are given sex-wise. Average membership per union is 
also computed. State-wise figures are also available 
separately. These figures are classified according to 
industries. 


(ii) Trade Union finances—this relates to sources of income 
and various items of expenditure of the registered 
unions. 


(E) Industrial Disputes Statistics


The Labour Bureau collects All India data regarding 
industrial disputes resulting in stoppage of work. Under it both 
strikes and lock-outs are covered. The data relate to :— 

(i) Number of workers involved directly or indirectly.

(ii) Number of mandays lost.

(iii) Industry-wise classification of disputes.

(iv) Disputes classified by cause.

(v) Classification of terminated industrial disputes by
results.

The Labour Bureau also publishes statistics relating to the number
of (a) Workers Committees, (b) Production Committees and
(c) Joint Committees, in different industries. Such data are
available industry-wise and state-wise.


(F) Social Security and Labour Welfare Statistics


There are a number of Acts under which social security
measures are available to labour in India. Data are published
with regard to the benefits given to labourers under such Acts. Some
of them are given in brief below :—

(i) Workmen's Compensation Act, 1923 :—The Labour Bureau
publishes data regarding (a) the number of injuries for which
compensation was paid, (b) the amount of compensation paid.
The figures are published state-wise. Such data are collected by
the States, and they send them to the Labour Bureau for final
publication.

(ii) Employees State Insurance Scheme :—Data relating to
the working of the ESI scheme are published in the Indian Labour
Year Book, in the following details :—

(a) Rates of weekly contribution of employees.
(b) Rates of benefits applicable under various types of
disablements.
(c) Dependents benefits.
(d) Medical benefits.
(e) Cash benefits paid during the year under various heads.
(f) Areas where the ESI scheme has been enforced.
(g) Number of employees covered under the Act.
(h) Sources of income and items of expenditure of the
ESI Fund.

(iii) Maternity Benefit Act :—The Labour Bureau publishes
state-wise figures regarding women (a) who claimed maternity
benefits, (b) who were paid maternity benefits in full or in part,
and (c) the total amount paid.

The Labour Bureau also publishes data regarding the Employees
Provident Fund Act, the Coal Mines Provident Fund and Bonus
Scheme Act etc.

Recent steps :—Under the Collection of Statistics Act 1953
the Ministry of Labour and Employment have drafted two separate
sets of rules, namely, (i) Collection of Statistics (Labour) Central
Rules and (ii) Collection of Statistics (Labour) State Rules.
Under these rules data can be collected regarding :—

(i) Prices of commodities, (ii) Attendance, (iii) Living
conditions including housing, water supply and sanitation, (iv)
Indebtedness, (v) Rent of dwelling houses, (vi) Wages and other
earnings, (vii) Provident and other funds provided for labour,
(viii) Benefits and other amenities provided for labour, (ix)
Hours of work, (x) Employment and unemployment, (xi)
Industrial and labour disputes, (xii) Labour turnover and
(xiii) Trade unions.


AN APPRAISAL OF LABOUR STATISTICS IN INDIA 


The International Labour Organisation have laid down in
broad limits certain essential types of labour statistics to be
compiled in every member country. It is for the Government
of the member country to set up the necessary statistical organisation
for the purpose. The I.L.O. have established broad standards for
the collection of statistical data pertaining to labour problems, so
that international comparisons may be made. It is also desired that
the data regarding labour should come up to a certain standard. The
I.L.O., therefore, laid down that labour statistics should be collected
under the following broad classification :—

1—Classification of labour industry-wise and occupation-wise.
2—Statistics of employment and unemployment.
3—Wages and hours of work.
4—Levels of living.
5—Family living.
6—Statistics of injuries.
7—Statistics of trade unions, disputes etc.

Shortcomings of labour statistics in India :

1—Employment statistics are very poor. There are no accurate
and reliable statistics regarding unemployment.

2—Data on wages are also highly inadequate.

3—There are absolutely no statistics regarding productivity.

4—Labour statistics generally cover industrial labour; there
are no proper data on other types of labour.


CHAPTER 12 


GENERAL CRITICISM OF 
INDIAN STATISTICS 


The statistical data in India are generally a by-product of
administration. Therefore the major bulk of the statistical material
available in India consists of official statistics. Non-official
statistics obtained in the country are meagre and in the majority of
cases remain unpublished. The non-official statistics, therefore,
cannot be said to be complete and reliable, as the agencies for
collection and the sources of their information cannot always be
depended upon. In recent years, however, a few agencies like the
N.C.A.E.R. have been doing commendable work in the field of
statistical studies. The results of other agencies working in the
country cannot be said to be reliable, because they have affiliation
with certain institutions. As regards official statistics, attempts
have been made in recent years to collect them on scientific lines,
yet much is left to be done. Great caution is required in using
even the official statistics. The main shortcomings of the statistical
material available in the country are as follows :—

1—Inadequacy of data—The statistical data collected in India
are inadequate as they are not collected or classified under expert
guidance nor with an object suiting the requirements of the public.
The Indian Economic Enquiry Committee 1925 drew attention to
the inadequacy of Indian statistical material in these words.

“For the purpose of determining in what respects the statistical
data available are deficient from economic point of view, the
subject may be considered under the following three main
classes :—”

(i) General statistics other than production comprising
Finance, Population, Trade, Transport and Communications,
Education, Vital Statistics and Migration.

(ii) Statistics of production, including Agriculture, Pasture
and Dairy-Farming, Forest, Fisheries, Minerals, Large scale
Industries, Cottage and Small-scale Industries.

(iii) Estimates of Income, Wealth etc., Income, wealth, Cost
of living, Indebtedness, Wages and Prices.

The statistics falling under (i) are more or less complete and
adequate, those under (ii) are partially complete and satisfactory
and partially inadequate and wanting in many respects, while those
under (iii) are highly unsatisfactory.


The Bowley-Robertson Committee, 1934, made the same 
remarks. After independence the National Government took special 
interest in the development of statistical material in the country, 
because they realised that their schemes of planning could not 
succeed without a complete knowledge of facts and figures. The 
C.S.O. is doing commendable work in this direction. The N.S.S. 
is making available data relating to the various sectors of the Indian 
economy. Today we find that we have considerably advanced in 
matters of statistics. There are still a few gaps here and there to 
be filled; very soon we shall be able to make up this deficiency, 
and Indian statistical material will be on a par with that of the advanced 
countries of the world.


2—Inaccuracy of data—The accuracy of Indian statistical 
material is questionable. Firstly, the agencies employed for 
the collection of primary data are hardly trustworthy. For instance, 
agricultural statistics are collected by Patwaris, who are not 
technically trained for the job. Similarly, enumerators engaged for 
census work are not interested in it owing to the honorary nature 
of the work. Secondly, scientific methods are very rarely applied to 
the analysis of primary data. In recent years the Government have 
taken steps to appoint trained personnel and also to evolve scientific 
methods of collection and analysis of data. Such steps will go a 
long way in removing the long-standing charge of inaccuracy in 
Indian statistics.


3—Inconsistency and incompleteness—This is another defect 
of Indian statistics. Incompleteness in coverage has been 
considerably diminished since independence, but incompleteness in 
scope is virtually the same. Steps are being taken by the Central 
and State Governments to overcome this shortcoming. There is 
also no uniformity in the methods of collection and classification or in the 
definition of statistical units, which makes the data incomparable. There 
is need of standardising these things. The Government of India 
are laying down a set of uniform definitions, rules and procedures 
etc. in order to bring uniformity in the data. 


4—Lack of Co-ordination—Indian statistics lack proper 
co-ordination. Though the Central Statistical Organisation has been set 
up with this end in view, there is still duplication and overlapping 
in the collection of data. There is need for more co-ordination 
between the statistical organisations of the Central and State Governments. 
Lack of co-ordination results in waste of energy and money, and 
makes it difficult to consolidate the data.


5—Not self-explanatory—Another shortcoming of the official 
statistics is that their exact significance, scope and methods of 
compilation are not widely known, and therefore they are not 
self-explanatory. However, the office of the Economic Adviser tries to 
make up this deficiency, and it has published three volumes of 
‘Guide to Current Official Statistics’ and ‘Indian Food Statistics’. 
In fact, nowadays most of the statistical publications give 
explanations regarding the collected statistics.


6—Lack of proper analysis—Indian statistics are not properly 
analysed and processed. They are collected to suit the administrative 
needs and therefore are analysed keeping in view the administrative 
requirements. Economic principles and considerations are 
disregarded.


7—Poor Coverage—The coverage of Indian statistics is also 
very poor. Before independence statistics were available only for 
that part of the country known as British India. This defect is no 
longer found, and there has been extension in the scope and coverage 
of Indian statistics.


8—Delay in publication—There is very often inordinate delay 
in the publication of the statistical material. By the time figures 
are published they become quite obsolete and out of date and much 
of their usefulness is lost. Some delay is bound to occur in the 
very nature of things but sometimes the delay becomes scandalous. 
Something must be done to remove this defect.


9—Inadequate Publicity—Generally, statistics concerning 
public welfare are not properly publicised among the public by means 
of diagrams, charts etc. Many useful data could educate public 
opinion.

It is gratifying to note that these defects are gradually 
disappearing. The shape of Indian statistics has now completely 
changed and the progress made in this direction has recently been 
tremendous. 

Difficulties in the collection of statistics in India


1—Most Indian people are illiterate. They do not 
understand the value of statistics. They do not keep any records etc., 
so that the information cannot be correctly gathered.


2—Due to illiteracy and long foreign domination, Indian 
people, specially those of the rural areas, are afraid of any Government 
enquiry. They do not co-operate with the enumerators in 
furnishing the required information correctly.


3—The country being large, it becomes difficult to 
collect statistics from places far apart on an all-India basis. Due 
to lack of means of transport this becomes all the more difficult.


4—India is a land of diversities. There are many languages 
which are spoken by the people. There is great diversity in the 
economic status of the people. Their social customs differ. Due 
to these diversities sample study becomes difficult.


5—There is indifference on the part of collecting as well as 
information-supplying agencies regarding collection of statistics.


SUGGESTIONS FOR IMPROVEMENT 


The following suggestions are made for improvement of the 
statistical material :— 


1—The Government should collect more statistics of 
production, income, wealth, cost of living etc. and improve the quality 
of existing information on other matters.


2—The staff employed for collecting data should be 
well-trained so that the information collected may be accurate and 
reliable.


3—There should be more co-ordination not only between 
central and state departments but also with outside agencies. 


4—The information should be presented in a suitable form 
so as to be easily understandable to the public at large. Definitions, 
explanations and limitations should always be attached to make 
it simple and intelligible.


5—In order to reduce expenses the unimportant and 
unnecessary details should be omitted. More use should be made 
of charts and graphs.


6—There should be more arrangements for research on both the 
theoretical and applied sides of Statistics. In this work co-operation 
with the research institutes and universities should be sought. 


7—There should be standardisation of statistical definitions 
and methods.


8—The data should be made available to the public as soon 
as possible. If information cannot be published in full without 
causing delay, it should be published in parts. There should be 
decentralisation in the publication of data.


9—The Government should encourage non-official agencies 
in the collection of statistics. 

10—Last but not the least, the co-operation of trade and 
industry should be sought.

Hence the statistical system and data of our country need 
certain changes in the interest of better quality and bigger 
quantum of statistics. The present state of affairs, though it has 
become much better than what it was formerly, is still chaotic and 
needs further improvement. The sooner improvement takes place 
the better it would be.


QUESTIONS 


1. "Statistics in India have largely originated as a by-product 
of administrative activities." Examine this statement in the light 
of recent developments in Indian Statistics. (M. Com. B. H. U.)


2. ‘Statistics in India are neither complete nor reliable’. 
Assess the correctness of this statement. (M. Com. Agra) 


3. Describe the organisation and functions of the Central 
Statistical Organisation (CSO) in India. (M. Com. Raj.)


4. Describe briefly how the census of population is conducted 
in India. What important facts have been brought to light by 
the 1961 census ? (B. Com. B. H. U.)


5. Enumerate the items on which information was collected 
in the 1961 census of India. Point out the principal modifications 
introduced in the 1961 census schedule as compared to the schedule 
of the previous census. (M. Com. B. H. U.)


6. ‘Census is not merely the counting of heads but it also 
gives a fund of other valuable information.’ Comment on this 
statement in the light of the 1961 census. (B. Com. Agra)


7. What is the method current in India of collecting 
agricultural statistics of area and yield ? Express your opinion about 
the accuracy of the method employed. (B. Com. Luck.)

8. Write a short essay on ‘Industrial statistics in India’. 
(B. Com. B. H. U.)

9. What type of statistical data are available with regard to 
the foreign trade of India ? Describe the method of their 
collection and the extent of their accuracy. (B. Com. B. H. U.)


10. Write a brief critical note on the aims and achievements 
of the National Sample Survey of India. (M. Com. Raj.)


11. What are the special problems of National Income 
estimation in India ? Describe briefly the various methods 
followed for the calculation of Indian income. (M. Com. Alld.)


12. Write a note on ‘statistics of trade in India’. Discuss 
the recent changes introduced by the D. G. C. I. & S. in the 
publication of these statistics. (B. Com. Raj.)


13. What is meant by Census of Production ? Give a 
critical account of the statistical information collected under the 
Industrial Act. (M. Com. Raj.) 

14. What are the important price index numbers prepared in 
India ? Explain the mode of construction of any one of them. 
(M. Com. B. H. U.)

15. Point out the defects of the statistics of Industrial 
Production available in India. Suggest ways to improve them. 
(M. Com. Alld.) 

16. What do you understand by the term ‘Indian Agricultural 
Statistics’ ? Outline their shortcomings and give concrete 
suggestions to remedy them. (M. A. Raj.)